Preface



The discipline of microbiology is actively researched, and the field is
advancing continually. It is estimated that only about 1 % of all microbial
species on earth have been studied. Rapid advances in molecular biology have
revolutionized the study of microorganisms in the environment and improved
our understanding of the composition, phylogeny, and physiology of
microbial communities. The advent of molecular biology has offered a number of
revolutionary new insights into the detection and enumeration of soilborne
microorganisms. DNA sequences, such as the 16S rRNA and ITS sequences of
individual bacterial and fungal species, provide information for identifying
unknown species. Molecular methods allow both pathogens and
beneficial organisms in soils to be detected and quantified. The in-depth
exploitation of the potential of PCR has led to more sophisticated variants of the
technique (improving even on the currently expanding real-time PCR)
that increase the speed and sensitivity of microbial identification and
diagnostics. These molecular techniques provide new insights into the
functions of microorganisms and their interactions within ecological niches.

Analyzing Microbes — Manual of Molecular Biology Techniques is a 
practical guide to the application of important molecular biology techniques 
in microbiological research. The chapters are written by an international
group of university scientists and researchers who are recognized authorities
in their research areas and are often the developers of the new techniques
that are described. These volumes are aimed at graduate and postgraduate
students, Ph.D. students, and laboratory technicians working in different
biotechnology and microbiology laboratories. They are also valuable to the
larger community of researchers who have recognized the potential of
genomics research and may be beginning to explore the technologies involved.
Moreover, the volumes are also intended as handouts for students, teachers,
and researchers the world over.

The central parts of the chapters are the experimental protocols which are 
presented so as to be readily used at the laboratory bench. Although a number
of the procedures described are tried and trusted, we have striven to
include variants of existing techniques with which an experiment can be
performed. These step-by-step protocols are intended to be concise and easy
to follow. Suggestions to successfully apply the procedures are included, 
along with recommended materials and suppliers. A special feature of the 
chapters is that, in addition to the protocols, important background informa- 
tion and representative results of applying the methods are given. References 
are provided to enable the investigator to become better acquainted with
the topic. Researchers in any field that utilizes microbial systems will find
this work of value. In addition to microbiology and bacteriology, this book 
highlights the current state-of-the-art molecular microbiology techniques in 
biotechnology, microbiology research, and environmental microbiology. 

The aim of the book Analyzing Microbes — Manual of Molecular Biology 
Techniques has been to produce a self-contained laboratory manual which 
will be useful to both experienced practitioners and beginners in the field. We 
hope that this book stimulates your creativity and wish you success in your 
experiments. 
Chapter 1



Microbial DNA Extraction, Purification, and Quantitation 

Sukumar Mesapogu, Chandra Mouleswararao Jillepalli,
and Dilip K. Arora 



Abstract 

The cell wall of microorganisms is broken by chemical or enzymatic lysis or a combination of both. Generally,
lysozyme is used to digest the rigid cell wall structure, which has high amounts of lipid, while a detergent such as
SDS solubilizes the phospholipids in the cell membrane. EDTA destabilizes the cell envelope and
inactivates DNases by chelating the magnesium ions in the membranes, which are essential for the
integrity of the cell envelope. Insoluble cell debris is removed by centrifugation, leaving an upper aqueous
suspension containing the DNA, proteins, and RNA. Purification of DNA from proteins can be achieved
by various methods, generally by protease treatment, which hydrolyzes the proteins into water-soluble
amino acids, or by shaking the aqueous suspension with phenol-chloroform. The aqueous-phenol emulsion is
then separated by centrifugation. Proteins (having both hydrophobic and hydrophilic amino acid
residues) collect at the interface. RNA can be removed by RNase treatment, and DNA can be
concentrated by addition of ice-chilled ethanol or isopropanol; the precipitated DNA is collected as a pellet
by centrifugation. This chapter also describes protocols to check the purity of DNA and to quantify it.



1.1 Introduction 



Isolation of genomic DNA from microorganisms has become a 
useful tool to determine the fates of selected microorganisms or 
recombinant genes and to reveal genotypic diversity and its 
change in microbial ecosystems. The protocols in this chapter
provide a framework for isolating high-quality genomic DNA
from a variety of organisms, including bacteria, plasmid DNA [1],
actinomycetes, yeast [2, 3], and fungi [4]. All of these protocols
yield high molecular weight (HMW) DNA, which remains of high
quality (i.e., not degraded into smaller fragments) for several
years when stored as specified below. For each organism a specific
procedure is provided for releasing free chromosomal DNA from
its cellular or nuclear location. The first task in each of these
protocols is the disruption of the cell wall; the cells are typically
lysed in an SDS solution containing sucrose. The released DNA is
protected from degradation by DNases and other cellular proteins
by EDTA and proteinase K, respectively [2].

The chromosomal DNA of Escherichia coli is a large
circular molecule of approximately 4.6 Mb. The DNA is
attached to the plasma membrane at many points. Being large in
size, the DNA is prone to mechanical breakage. However, if
extraction is performed carefully, large fragments of chromosomal DNA
can be obtained with an average length of 1-2 kb. The bacterial
cell is enclosed in a cytoplasmic membrane and surrounded by
a rigid cell wall. With some species, including E. coli, the cell wall
may itself be enveloped by a second outer membrane. All of these
barriers have to be disrupted to release the cell components.
Techniques for breaking open bacterial cells can be divided into
physical methods, in which cells are disrupted by mechanical
forces, and chemical methods, in which cell lysis is brought about
by exposure to chemical agents that affect the integrity of the cell
barriers. Chemical methods are most commonly used with
bacterial cells when the object is DNA preparation. Chemical lysis
generally involves one agent attacking the cell wall and another
disrupting the cell membrane. The chemicals that are used depend
on the species of bacterium involved, but with E. coli and related
organisms, weakening of the cell wall is usually brought about by
lysozyme, ethylenediamine tetraacetate (EDTA), or a combination
of both. Lysozyme is an enzyme that is present in egg white and in 
secretions such as tears and saliva, and which digests the polymeric 
compounds that give the cell wall its rigidity. On the other hand, 
EDTA removes magnesium ions that are essential for preserving 
the overall structure of the cell envelope, and also inhibits cellular 
enzymes that could degrade DNA. Under some conditions, weak- 
ening the cell wall with lysozyme or EDTA is sufficient to cause 
bacterial cells to burst, but usually a detergent such as sodium 
dodecyl sulfate (SDS) is also added [5]. Detergents aid the process 
of lysis by removing lipid molecules and thereby cause disruption 
of the cell membranes. Once the cells are lysed, the final step in
preparation of a cell extract is removal of insoluble cell debris.
Components such as partially digested cell wall fractions can be
pelleted by centrifugation, leaving the cell extract as a reasonably
clear supernatant. Most protocols for the preparation of genomic
DNA consist of lysis, followed by incubation with a nonspecific 
protease and a series of extractions prior to precipitation of the 
nucleic acids. Such procedures effectively remove contaminating 
proteins, but are not effective in removing exopolysaccharides 
which can interfere with the activity of enzymes such as restriction 
endonucleases and ligases. In this chapter, however, the protease
incubation is followed by a CTAB extraction in which CTAB
forms complexes with both polysaccharides and residual protein,
effectively removing both in the subsequent emulsification and
extraction. This procedure is effective in producing digestible
chromosomal DNA from a variety of gram-negative bacteria, all
of which normally produce large amounts of polysaccharides. The
actinomycetes are gram-positive bacteria which have a
characteristically high G + C content in their DNA (>55 %). Many species
produce a wide variety of secondary metabolites, including
antihelminthic compounds, antitumor agents, and the majority of
known antibiotics, which have been exploited in
medicine and agriculture. The actinomycetes were originally
considered to be an intermediate group between bacteria and fungi
but are now recognized as prokaryotic. The detailed protocol is
represented as four phases, as shown in Fig. 1.1.

Fig. 1.1 General steps involved in genomic DNA isolation (Phase I: Cell lysis. Phase II: Protein degradation and
precipitation. Phase III: DNA precipitation. Phase IV: RNA degradation and precipitation)

The plasmid isolation protocol is based on the conformational
difference between plasmid and chromosomal DNA of bacteria. Plasmid
molecules are double-stranded, circular entities that generally exist as
covalently closed circular molecules (supercoiled form), whereas
chromosomal DNA, once sheared during extraction, exists as linear
double-stranded fragments. Use is therefore
made of the fact that linear double-stranded DNA is denatured by
exposure to the high pH of the lysing solution (in the range of
pH 12-12.5). In contrast, the covalently closed circular
plasmid DNA in supercoiled form is resistant to these conditions.
When the pH is lowered in subsequent steps, renaturation takes
place, and plasmid DNA renatures faster (if denatured) than
chromosomal DNA. Because renaturation is carried out at low
temperature and the drop in pH is sharp, the
chromosomal DNA forms an insoluble clump (aggregate) through
mismatched base pairing. The aggregated DNA can easily
be separated by centrifugation: the plasmid remains in the
supernatant while the aggregated chromosomal DNA forms a pellet [5].

The need to adapt organic extraction methods to take account 
of the biochemical contents of different types of starting material 
has stimulated the search for DNA purification methods that can
be used with any species. This is one of the reasons why ion- 
exchange chromatography has become so popular. A similar 
method involves a compound called guanidinium thiocyanate, 
which has two properties that make it useful for DNA purification. 
First it denatures and dissolves all biochemicals other than nucleic 
acids and can therefore be used to release DNA virtually from any 
type of cell or tissue. Second, guanidinium thiocyanate allows 
DNA to bind tightly to silica particles. This provides an easy way 
of recovering the DNA from the denatured cell extracts [6]. One
possibility is to add the silica directly to the cell extract but, as with
the ion-exchange methods, it is more convenient to use a column. In addition to
DNA, the cell extract contains significant quantities of protein and
RNA. A variety of methods can be used to purify the DNA from
this mixture. One approach is to treat the mixture with reagents that
degrade the contaminants, leaving a pure solution of DNA. The
standard way to deproteinize a cell extract is to add phenol or a 1:1
mixture of phenol and chloroform. These organic solvents
precipitate proteins but leave the nucleic acids (DNA and RNA) in
aqueous solution. The result is that if the cell extract is mixed
gently with the solvent and the layers then separated by centrifu- 
gation, precipitated protein molecules are left as a white coagu- 
lated mass at the interface between the aqueous and organic layers. 
The aqueous solution of nucleic acids can then be removed with a 
pipette [7]. 

With some cell extracts, the protein content is so great that a
single phenol extraction is not sufficient to purify nucleic acids
completely. This problem could be solved by carrying out several
phenol extractions one after the other, but this is undesirable as
each mixing and centrifugation step results in a certain amount of
breakage of the DNA molecules. The alternative is to treat
the cell extract with a protease such as pronase or proteinase K before
phenol extraction. These enzymes break polypeptides down into
smaller units, which are more easily removed by phenol. Some
RNA molecules, especially mRNA, are removed by phenol
treatment, but most remain with the DNA in the aqueous layer. The
only effective way to remove the RNA is with the enzyme
ribonuclease, which rapidly degrades these molecules into ribonucleotide
subunits. A second useful method is drop dialysis, which can
remove salt, SDS, and even some enzyme inhibitors.
As such, it can be used with many methods involving DNA
purification before or after enzymatic reactions. DNA fragments
larger than a few hundred base pairs can be separated from smaller
fragments by chromatography on a size exclusion column such as
Sephacryl S-500. To simplify this procedure, a mini-spin
column method has been developed [6]. For fragments from
200 bp to 10 kb, agarose gel purification is ideal. For smaller
fragments (20-400 bp), acrylamide gel purification is preferred.

Ultraviolet (UV) spectrophotometry is most commonly used
for the determination of DNA concentration. The resonance
structures of the pyrimidine and purine bases are responsible for this
absorption. DNA has an absorbance maximum at 260 nm and a
minimum near 230 nm. However, these are strongly affected by the degree of
base ionization and hence the pH of the measuring medium. If the
A260/A280 ratio is outside the 1.8-2.0 range,
the DNA should be purified further to remove contaminants. Absorbance
measurements at wavelengths other than 260 nm are used for
determination of DNA purity. The relevant spectrum for this
purpose lies between 320 and 220 nm. Any absorbance at
320 nm indicates contamination of a particulate nature. Proteins
absorb maximally at 280 nm due to the presence of tyrosine,
phenylalanine, and tryptophan, and absorption at this wavelength is used
for detection of proteins in DNA samples. This is usually done by
determination of the A260/A280 ratio [8]. 
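
As a worked illustration of the absorbance checks described above, the following Python sketch estimates DNA concentration and the A260/A280 purity ratio. It assumes the widely used conversion factor of 50 µg/ml per A260 unit for double-stranded DNA in a 1 cm path length cuvette, which this chapter does not state, and it subtracts A320 as a background correction. The function names and the example readings are hypothetical.

```python
# Sketch only: function names and example readings are illustrative assumptions.
DSDNA_UG_PER_ML_PER_A260 = 50.0  # assumed standard factor for dsDNA, 1 cm path


def dna_concentration(a260, a320=0.0, dilution_factor=1.0):
    """Return the estimated dsDNA concentration in ug/ml."""
    corrected = a260 - a320  # remove background caused by particulates
    return corrected * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260, a280, a320=0.0):
    """Return the background-corrected A260/A280 ratio."""
    return (a260 - a320) / (a280 - a320)


def is_acceptably_pure(ratio, low=1.8, high=2.0):
    """Apply the 1.8-2.0 acceptance range given in the text."""
    return low <= ratio <= high


if __name__ == "__main__":
    # Hypothetical readings taken on a 1:20 dilution of the DNA preparation
    a260, a280, a320 = 0.45, 0.24, 0.0
    conc = dna_concentration(a260, a320, dilution_factor=20)
    ratio = purity_ratio(a260, a280, a320)
    print(f"Concentration: {conc:.0f} ug/ml")
    print(f"A260/A280: {ratio:.2f}, within range: {is_acceptably_pure(ratio)}")
```

A ratio below the range would suggest protein carry-over and a further purification round; the thresholds are passed as parameters so that a laboratory's own acceptance range can be substituted.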



1.2 Materials



1.2.1. Bacterial (E. coli) 
DNA Isolation 



1. Luria Bertani (LB) Broth 

2. TE buffer— 50 mM Tris, 50 mM EDTA (pH 8.0) 

3. Tris (pH 8.0)— 250 mM 

4. Lysozyme — 10 mg/ml 

5. SDS— 0.5 % 

6. EDTA— 0.4 M 

7. Proteinase K — 1 mg/ml 

8. Phenol equilibrated with Tris (Phenol is a hazardous organic 
solvent. Always use suitable laboratory gloves when handling 
phenol containing solutions. Specific waste procedures may be 
required for the disposal of phenol containing solutions.) 
9. Sodium acetate (pH 5.8)

10. Ethanol: 95 %

11. RNase: 200 µg/ml

12. Chloroform



1.2.2. Gram-negative Bacterial DNA Isolation by CTAB Method



1. Nutrient broth: 25 ml

2. Tris-EDTA (TE) buffer, pH 8.0 (10 mM Tris-Cl, 1 mM EDTA)

3. 10 % SDS

4. 20 mg/ml proteinase K

5. 5 M NaCl

6. CTAB/NaCl solution: Dissolve 4.1 g NaCl in 30 ml water
and slowly add 10 g cetyl trimethylammonium bromide
(CTAB) while stirring. If necessary, heat to 65 °C. Adjust to
100 ml

7. Chloroform

8. Isoamyl alcohol

9. Buffered phenol (8-hydroxyquinoline, liquefied redistilled
phenol, 50 mM Tris-Cl, pH 8.0, TE buffer pH 8.0): Add
0.5 g of 8-hydroxyquinoline to a 2-liter glass beaker. Gently add
500 ml liquefied phenol (crystals of redistilled phenol melted
in a 65 °C bath). The phenol will turn yellow due to the
8-hydroxyquinoline, which is added as an antioxidant. Add
500 ml of 50 mM Tris base, cover with aluminum foil, and
stir 10 min at low speed. Let the phases separate at room
temperature and gently decant the top (aqueous) phase into a
suitable waste receptacle. Remove any residual aqueous phase
with a glass pipette. Repeat twice with 500 ml each of
50 mM Tris-Cl, pH 8.0. Check the pH of the phenol with pH paper
and repeat equilibration until pH = 8.0. Store at 4 °C in
brown glass bottles or in clear glass bottles wrapped in
aluminum foil

10. Isopropanol

11. 70 % (v/v) ethanol



1.2.3. Plasmid DNA Isolation



1. Luria Bertani (LB) medium supplemented with the appropriate
antibiotic

2. Lysis buffer I

(a) 25 mM Tris-Cl (pH 8.0)

(b) 50 mM glucose

(c) 10 mM EDTA (pH 8.0)

(d) 0.2 mg/ml RNase A




3. Lysis buffer II (always freshly prepared)

(a) 0.2 N NaOH

(b) 1 % (w/v) SDS

4. Lysis buffer III

(a) 3 M potassium acetate (pH 5.5)

5. Chloroform:isoamyl alcohol (24:1)

6. Isopropanol

7. Ethanol (70 %)

8. TE buffer (pH 8.0)



1.2.4. Actinomycetes DNA Isolation



1. GYM broth

(a) 4.0 g glucose, 4.0 g yeast extract, 10.0 g malt extract,
1 L distilled water, pH 7.4

2. SET buffer

(a) 75 mM NaCl, 25 mM EDTA, 20 mM Tris, pH 7.5

3. SDS (10 %)

4. Lysozyme (10 mg/ml)

5. Proteinase K (20 mg/ml)

6. RNase A (10 mg/ml)

7. 5 M NaCl

8. Phenol

9. Chloroform

10. Isoamyl alcohol

11. Isopropanol

12. Ethanol

13. Sodium acetate, 3 M (pH 5.2)



1.2.5. Yeast DNA Isolation



1. Yeast extraction buffer A: 2 % Triton X-100, 1 % sodium
dodecyl sulfate, 100 mM NaCl, 10 mM Tris-HCl, pH 8.0,
1 mM EDTA, pH 8.0.

2. Phenol:chloroform:isoamyl alcohol: phenol is presaturated
with 10 mM Tris-HCl, pH 7.5. Prepare a mixture of 25:24:1
phenol:chloroform:isoamyl alcohol (v/v/v). This solution can
be stored at room temperature for up to 6 months, shielded
from light.

3. Glass beads, diameter range 0.04-0.07 mm, suspended as a
500 mg/ml slurry in distilled water.

4. Ammonium acetate (4 M).
1.2.6. Fungal DNA Isolation by CTAB Method



1. CTAB extraction buffer: 0.1 M Tris-HCl, pH 7.5, 1 % CTAB
(mixed hexadecyltrimethylammonium bromide), 0.7 M
NaCl, 10 mM EDTA, 1 % 2-mercaptoethanol. Add proteinase
K to a final concentration of 0.3 mg/ml prior to use.

2. Chloroform:isoamyl alcohol (24:1).



1.2.7. Purification of 
DNA by Phenol 
Extraction and Ethanol 
Precipitation 



1. Phenol

2. TE buffer, pH 8.0 (10 mM Tris-HCl, pH 8.0; 1 mM EDTA, 
pH 8.0) 

3. 24:1 (v/v) chloroform-isoamyl alcohol 

4. 3 M potassium acetate, pH 5.5, prepared by adding glacial 
acetic acid to 3 M potassium acetate until this pH is obtained 
(store at 4 °C) 

5. Cold 100 % ethanol (-20 °C)

6. Cold 70 % ethanol in sterile dH2O (-20 °C)



1.2.8. Drop Dialysis 
Method 



1. Drop dialysis filter

2. Sterile dialysis buffer (TE pH 8.0) 

3. Petri dish 



1.2.9. Purification on Sephacryl S-500 Spin Columns



1. Sephacryl S-500 column

2. 1× TM buffer

3. 100 mM Tris-HCl (pH 8.0)



1.2.10. DNA Fragment 
Purification from 
Agarose or Acrylamide 



1. Crush and soak solution

(a) 500 mM NH4OAc: 3.3 g NH4OAc

(b) 0.1 % SDS: 0.1 g SDS

(c) 0.1 mM EDTA: 20 µl of 500 mM EDTA

Make up to 100 ml with Milli-Q water and store at room
temperature

2. 3 M NaOAc, pH 5.2

(a) 24.6 g anhydrous sodium acetate; adjust the pH to 5.2 with
acetic acid, bring up to 100 ml with Milli-Q water, and store
at room temperature

3. Ethanol

4. Ethidium bromide (EtBr)

5. Phenol (do not expose to light)

6. Chloroform (store in brown bottle)

7. Other reagents

(a) DMCS-treated glass wool (50 g)

(b) 0.22 µm disposable microtip filters (syringe type); blue
tips with melted ends to serve as a pestle for crushing
acrylamide
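
Reagent masses and stock volumes such as those in the recipes above follow from simple arithmetic. The Python sketch below is illustrative only: the helper names are hypothetical, and the molar mass of anhydrous sodium acetate (82.03 g/mol) is a standard value that is not given in this chapter.

```python
# Illustrative helpers; the names are hypothetical, not part of the protocol.

def grams_required(molarity_mol_per_l, volume_ml, molar_mass_g_per_mol):
    """Mass of solute needed for a solution of the given molarity and volume."""
    return molarity_mol_per_l * (volume_ml / 1000.0) * molar_mass_g_per_mol


def stock_volume_ul(final_conc_mm, final_volume_ml, stock_conc_mm):
    """Volume of stock in microliters, from C1 * V1 = C2 * V2."""
    return final_conc_mm * final_volume_ml / stock_conc_mm * 1000.0


if __name__ == "__main__":
    SODIUM_ACETATE_ANHYDROUS = 82.03  # g/mol, assumed standard value
    naoac_g = grams_required(3.0, 100.0, SODIUM_ACETATE_ANHYDROUS)
    edta_ul = stock_volume_ul(0.1, 100.0, 500.0)
    print(f"3 M NaOAc, 100 ml: {naoac_g:.1f} g")
    print(f"0.1 mM EDTA in 100 ml from a 500 mM stock: {edta_ul:.0f} ul")
```

The same two helpers can be reused for other stocks in this section by substituting the relevant molar mass or stock concentration.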
1.2.11. DNA Quantification and Estimation by Gel Electrophoresis



1. Genomic DNA samples

2. Lambda HindIII DNA ladder

3. Loading dye



1.2.12. DNA Quantification and Estimation by Spectrophotometer



1. Genomic DNA

2. Spectrophotometer

3. Cuvettes

4. Distilled water


1.3 Method



1.3.1. Bacterial (E. coli) DNA Isolation



1. Grow the bacterial cells in 500 ml of LB broth medium

2. Subject the above overnight culture to centrifugation to
obtain a pellet and dissolve the pellet in 5 ml of TE buffer
[50 mM Tris (pH 8.0), 50 mM EDTA]

3. Freeze the above cell suspension at -20 °C

4. To the frozen suspension, add 0.5 ml of 250 mM Tris
(pH 8.0) and lysozyme (10 mg/ml) and thaw the contents
at room temperature. After thawing, place on ice again for
45-50 min

5. Add 1 ml of 0.5 % SDS, 50 mM Tris (pH 7.5), 0.4 M EDTA, 
1 mg/ml proteinase-K. Incubate in water bath at 50 °C for 
60 min 

6. After incubation, extract with 6 ml of phenol and centrifuge at
10,000 × g for 5 min

7. Transfer top layer to a new tube 

8. Add 0.1 volume of 3 M Na-acetate and mix gently 

9. Add 2 volumes of 95 % ethanol and mix by gentle inversion

10. Spool out the precipitated DNA and add 5 ml of 50 mM
Tris (pH 7.5), 1 ml of EDTA, 200 µg/ml RNase. Dissolve it
overnight by rocking at 4 °C

11. Extract with an equal volume of chloroform (mix by gentle
inversion) and centrifuge at 10,000 × g for 5 min

12. Transfer top layer to new tube 

13. Add 0.1 volume of 3 M Na acetate and mix gently 

14. Add 2 volumes of 95 % ethanol and gently mix the contents 

15. Spool out the DNA and dissolve in 2 ml of TE buffer 

16. Check the purity of the DNA and store at 4 °C in TE till 
further use 
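
Steps 8-9 and 13-14 specify sodium acetate and ethanol as multiples of the sample volume. A minimal Python sketch of that arithmetic follows. The function name is hypothetical, and the sketch assumes that both multiples are taken relative to the volume of the aqueous phase recovered in the preceding step, which the protocol does not state explicitly.

```python
def precipitation_volumes(sample_ml, acetate_fraction=0.1, ethanol_multiple=2.0):
    """Return (sodium acetate ml, ethanol ml) for a given aqueous sample volume.

    Assumption: both reagents are scaled to the original sample volume,
    not to the volume after sodium acetate has been added.
    """
    if sample_ml <= 0:
        raise ValueError("sample volume must be positive")
    acetate_ml = acetate_fraction * sample_ml
    ethanol_ml = ethanol_multiple * sample_ml
    return acetate_ml, ethanol_ml


if __name__ == "__main__":
    # Hypothetical example: 6 ml of aqueous phase recovered after extraction
    acetate, ethanol = precipitation_volumes(6.0)
    print(f"3 M Na-acetate: {acetate:.1f} ml; 95 % ethanol: {ethanol:.1f} ml")
```

If a laboratory scales the ethanol to the combined volume instead, the ethanol volume would be computed from `sample_ml + acetate_ml`.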
1.3.2. Gram-negative Bacterial DNA Isolation by CTAB Method



1. Grow a 5 ml bacterial culture until saturated. Microcentrifuge
1.5 ml for 2 min or until a compact pellet forms. Resuspend the
pellet in 567 µl TE buffer.

2. Add 30 µl of 10 % SDS and 3 µl of 20 mg/ml proteinase K,
mix thoroughly, and incubate 1 h at 37 °C.

3. Add 100 µl of 5 M NaCl and mix thoroughly.
If the NaCl concentration is <0.5 M, the nucleic acid may also
precipitate.

4. Add 80 µl of CTAB/NaCl solution, mix thoroughly, and
incubate 10 min at 65 °C.

5. Add 1 volume (0.7-0.8 ml) of 24:1 chloroform/isoamyl
alcohol, mix thoroughly, and microcentrifuge 4-5 min.
Transfer the supernatant to a fresh tube. If it is difficult to remove
the supernatant, remove the interface first with a sterile
toothpick.

6. Add 1 volume of 25:24:1 phenol/chloroform/isoamyl
alcohol, extract thoroughly, and microcentrifuge 5 min. Transfer
the supernatant to a fresh tube.

7. Add 0.6 volume isopropanol and mix gently until a stringy
white DNA precipitate forms. Transfer the precipitate to a fresh
tube containing 70 % ethanol using a hooked, sealed Pasteur
pipette. Alternatively, microcentrifuge briefly at room
temperature, discard the supernatant, and add 70 % ethanol to the pellet.

8. Microcentrifuge 5 min at room temperature and dry the pellet
briefly in a lyophilizer. Resuspend in 100 µl TE buffer.
Typical yield is 5-20 µg DNA/ml starting culture (10^8-10^9
cells/ml).



1.3.3. Plasmid DNA Isolation



1. Prepare a small overnight culture of the host carrying the plasmid in
2 ml of a rich medium (LB) containing the appropriate
antibiotic at 37 °C with vigorous shaking.

2. Pour 1.5 ml of the above culture in a microfuge tube. 
Centrifuge in a microfuge for 30 s at maximum speed. 

3. Discard the supernatant and dry the bacterial pellet. To this
add 100 µl of ice-cold lysis buffer I. Vortex vigorously.

4. Add 200 µl of freshly prepared lysis buffer II to the above
suspension. Mix the contents well by inverting the tubes
several times.

5. Add 150 µl of ice-cold lysis buffer III. Mix the contents by
inverting the tubes several times.

6. Centrifuge the above lysate at maximum speed for 5 min.

7. Discard the pellet (chromosomal DNA) and collect the
supernatant in a fresh tube.
8. To the supernatant add an equal volume of chloroform:isoamyl
alcohol. Mix and centrifuge at maximum speed for 2-3
min. Transfer the upper aqueous layer to a fresh tube.

9. Precipitate the plasmid DNA from the supernatant by adding
an equal volume of ice-chilled isopropanol.

10. Centrifuge at maximum speed for 5 min.

11. Remove the supernatant and wash the pellet with 70 %
ethanol several times. Take care of the pellet as it may not
adhere tightly to the tube.

12. Keep the microcentrifuge tube open so that the ethanol
evaporates and the pellet is dry.

13. Dissolve the pellet in 50 µl of TE and store at -20 °C.



1.3.4. Actinomycetes DNA Isolation



1. Grow the mycelia (1-2 ml) in a GYM broth shake culture at
37 °C for 48 h in an orbital shaker at 120 rpm.

2. Centrifuge the broth at 800 rpm for 10 min and wash the
pellet with sterile distilled water at least twice, centrifuging
after each wash.

3. To the mycelial pellet add 4 ml SET buffer. Add lysozyme to
a concentration of 1 mg/ml and incubate at 37 °C for 1 h.

4. Add 0.1 volume of 10 % SDS and 0.5 mg/ml proteinase K
and incubate at 55 °C with occasional inversion for 2 h.

5. Add one-third volume of 5 M NaCl and 1 volume of chloroform
and incubate at room temperature for 0.5 h with frequent
inversion.

6. Centrifuge the mixture at 8,000 rpm for 15 min and transfer
the aqueous phase to a new tube using a blunt-ended pipette
tip.

7. Precipitate the chromosomal DNA by the addition of 1
volume of isopropanol with gentle inversion.

8. Transfer the DNA to a new tube, rinse with 70 % ethanol, dry
under vacuum, and dissolve in 100 µl of sterile distilled water.

9. Treat the dissolved DNA with 20 µg/ml RNase A at 37 °C
for 1 h.

10. Extract the samples with an equal volume of
phenol/chloroform/isoamyl alcohol (25:24:1) and precipitate with
2.5 volumes of ice-cold ethanol and 0.1 volume of 3 M sodium
acetate.

11. Wash the pellets with 70 % ethanol, vacuum dry, and dissolve
in 100 µl of sterile distilled water. Store the vials at -20 °C.

12. Check the purity of DNA by agarose gel electrophoresis and
quantify using a spectrophotometer.
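
Several steps in this chapter give centrifugation speeds in rpm, whereas others give relative centrifugal force (× g). The two are related through the rotor radius by the standard formula RCF = 1.118 × 10^-5 × r × rpm^2 (r in cm), which this chapter does not state. The Python sketch below applies it. The function names and the 8 cm radius are hypothetical, so the radius of the rotor actually used should be substituted.

```python
import math

RCF_FACTOR = 1.118e-5  # standard constant for radius in cm and speed in rpm


def rpm_to_rcf(rpm, radius_cm):
    """Convert rotor speed (rpm) to relative centrifugal force (x g)."""
    return RCF_FACTOR * radius_cm * rpm ** 2


def rcf_to_rpm(rcf, radius_cm):
    """Convert relative centrifugal force (x g) to rotor speed (rpm)."""
    return math.sqrt(rcf / (RCF_FACTOR * radius_cm))


if __name__ == "__main__":
    radius_cm = 8.0  # hypothetical rotor radius; check the rotor's data sheet
    for rpm in (800, 2000, 8000):
        print(f"{rpm} rpm is about {rpm_to_rcf(rpm, radius_cm):.0f} x g")
    print(f"10,000 x g needs about {rcf_to_rpm(10000, radius_cm):.0f} rpm")
```

Because the force depends on the square of the speed, the same rpm setting can give very different forces in rotors of different radius, which is why values in × g transfer between instruments more reliably than values in rpm.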
1.3.5. Yeast DNA Isolation



1. Collect cells from a fresh 5 ml culture by centrifugation at
2,000 × g for 10 min and resuspend in 0.5 ml of water.

2. Transfer the cells to a 1.5-ml microfuge tube and collect by
centrifugation at 15,000 × g for 10 min. Pour off the supernatant and
resuspend in the residual liquid.

3. Add 0.2 ml of buffer A, 200 µl of glass beads, and 0.2 ml of
phenol:chloroform:isoamyl alcohol (25:24:1).

4. Vortex for 3 min and add 0.2 ml of TE.

5. Centrifuge at 15,000 × g for 5 min and then transfer the aqueous
phase to a new tube.

6. Add 1 ml of 100 % EtOH (room temperature), invert the tube to
mix, and centrifuge at 15,000 × g for 2 min.

7. Discard the supernatant and resuspend the pellet in 0.4 ml of TE
(no need to dry the pellet).

8. Add 10 µl of 4 M ammonium acetate, mix, and then add 1 ml
of 100 % EtOH and mix.

9. Centrifuge at 15,000 × g for 2 min and dry the pellet. Resuspend
in 50 µl of TE.



1.3.6. Fungal DNA Isolation by CTAB Method



1. Grind 0.2-0.5 g (dry weight) of lyophilized mycelial pad in a
mortar and pestle.

2. Transfer to a 50-ml disposable centrifuge tube. 

3. Add 10 ml (for a 0.5 g pad) of CTAB extraction buffer. 

4. Gently mix to wet the entire powdered pad. 

5. Place in 65 °C water bath for 30 min. 

6. Cool and add an equal volume of chloroform/isoamyl alcohol
(24:1).

7. Mix and centrifuge at 2,000 × g for 10 min at room
temperature.

8. Transfer aqueous supernatant to a new tube. 

9. Add an equal volume of isopropanol. 

10. High molecular weight DNA should precipitate upon mixing 
and can be spooled out with a glass rod or hook. 

11. Rinse the spooled DNA with 70 % ethanol. 

12. Air dry, add 50 µl of TE containing 20 µg/ml RNase A. To
resuspend the samples, place in a 65 °C bath or allow the pellets to
resuspend overnight at 4 °C.
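
Step 3 gives the buffer volume for a 0.5 g pad only. As a hedged illustration, the Python sketch below scales that volume linearly to other pad masses and computes how much proteinase K stock is needed to reach the 0.3 mg/ml final concentration listed for the CTAB extraction buffer. Linear scaling and the 20 mg/ml stock concentration are assumptions (the latter is borrowed from other protocols in this chapter), and the function names are hypothetical.

```python
BUFFER_ML_PER_G = 10.0 / 0.5  # 10 ml of CTAB buffer for a 0.5 g pad (step 3)


def ctab_buffer_ml(pad_mass_g):
    """CTAB extraction buffer volume, assuming linear scaling with pad mass."""
    return BUFFER_ML_PER_G * pad_mass_g


def proteinase_k_stock_ul(buffer_ml, final_mg_per_ml=0.3, stock_mg_per_ml=20.0):
    """Stock volume (ul) from C1 * V1 = C2 * V2, ignoring the small volume added."""
    return buffer_ml * final_mg_per_ml / stock_mg_per_ml * 1000.0


if __name__ == "__main__":
    for mass_g in (0.2, 0.35, 0.5):  # pad masses within the 0.2-0.5 g range
        volume = ctab_buffer_ml(mass_g)
        pk = proteinase_k_stock_ul(volume)
        print(f"{mass_g} g pad: {volume:.1f} ml buffer, {pk:.0f} ul proteinase K stock")
```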



1.3.7. Phenol 
Extraction and Ethanol 
Precipitation 



1. Add an equal volume of phenol to the DNA containing 
reaction mixture and vortex gently. 

2 . Separate the aqueous phase which contains the DNA from the 
organic phase by centrifugation in the microfuge, at 
2,000 rpm for 5 min or at 8,000 rpm for 1 min. 




3. Remove the aqueous phase with care into a fresh microfuge
tube and add an equal amount of 24:1 (v/v)
chloroform-isoamyl alcohol.

4. In order to precipitate the DNA, add 0.1 volume of 3 M
sodium acetate, pH 5.5, to the aqueous phase and then
2 volumes of absolute ethanol. Incubate at -20 °C overnight
or for shorter periods at -80 °C (e.g., 20-30 min).

5. Recover the precipitated DNA by centrifugation in the
microfuge at 10,000 rpm for 5-15 min. Remove the ethanol with
care and dry the pellet in a desiccator or 50 °C oven for 5 min.
An extra wash with 70 % (v/v) ethanol may be included to
remove excess salt from the pellet. The dried DNA may be
resuspended in sterile TE, pH 8.0, or water and stored at 4 °C
for further manipulation or at -20 °C for long-term storage.

6. This procedure denatures and removes contaminating protein
from a DNA sample.



1.3.8. Drop Dialysis Method



1. Gently place a drop dialysis filter, floating correct-side up, on
10-20 ml of sterile dialysis buffer (TE, pH 8.0, or water) in a
Petri dish.

2. Gently pipette the DNA sample (10-100 µl) onto the filter.

3. Allow to dialyze for 1-2 h before removing the DNA for
further analysis.



1.3.9. Purification on Sephacryl S-500 Spin Columns



1. Thoroughly mix a fresh bottle of Sephacryl S-500,
distribute in 10 ml portions, and store in screw-cap bottles or
centrifuge tubes in the cold room.

2. Prior to use, briefly vortex the matrix and, without allowing
it to settle, add 500 µl of this slurry to a mini-spin column
(Millipore) which has been inserted into a 1.5-ml
microcentrifuge tube.

3. Following centrifugation at 2,000 rpm in a tabletop
centrifuge, carefully add 200 µl of 100 mM Tris-HCl (pH 8.0) to
the top of the Sephacryl matrix and centrifuge for 2 min
at 2,000 rpm. Repeat this step twice more. Place the Sephacryl
matrix-containing spin column in a new microcentrifuge
tube.

4. Then, carefully add 40 µl of DNA to the Sephacryl matrix
(saving 2 µl for later agarose gel analysis) and centrifuge at
2,000 rpm for 5 min. Remove the column and save the solution
containing the eluted, large DNA fragments (fraction 1).
Apply 40 µl of 1× TM buffer and recentrifuge for 2 min at
2,000 rpm to obtain fraction 2, and repeat this 1× TM rinse
step twice more to obtain fractions 3 and 4.

5. To check the DNA purity, load 3-5 µl of each eluate fraction
onto a 0.7 % agarose gel that includes, as a control, the 2 µl of DNA
saved in step 4 above.
1.3.10. DNA Fragment Purification from Agarose or Acrylamide



1.3.10.1. Agarose Gels



1. Prepare spin columns by cutting off the cap of a 0.5 ml
Eppendorf tube and forming a hole in the bottom with a hot
18-gauge needle. Fill this “mini-column” with a small ball of
DMCS-treated glass wool and pack down with a pipette tip.

2. Cut out the desired band from an agarose gel and place in
a spin column inside a 1.5-ml Eppendorf tube with the top
cut off.

3. Spin at 6,000 rpm in a microfuge for 10 min.

4. Phenol/chloroform extract the flow-through and EtOH
precipitate with glycogen or tRNA and 10 % v/v of 3 M NaOAc,
pH 5.2.

5. Wash and dry, resuspend in 20 µl TE, run 10 µl on a gel, and
use 1-2 µl for a ligation.



1.3.10.2. Acrylamide Gels



1. Run a 4-6 % acrylamide gel in 1× TBE, stain in EtBr
(1-10 mg/ml), and cut out the desired band.

2. Crush the acrylamide with a P1000 tip with a melted end to
resemble a pestle for the Eppendorf “mortar.”

3. Add 1 ml crush and soak solution and incubate overnight at
37 °C.

4. Spin in the microfuge for 10 min at 14,000 rpm. Remove as
much liquid as possible and add another 500 µl of crush and
soak solution.

5. Repeat the spin and pool the recovered supernatant.

6. Add 0.1 volume of 3 M NaOAc, 2.5 volumes of EtOH, and
carrier (see above).

7. Spin as usual, wash, and dry. Resuspend in 20 µl TE.



1.3.11. DNA Quantification and Estimation by Gel Electrophoresis



1. Obtain an ice bucket and keep DNA samples on ice

2. Heat Hind 111 ladder at 60-65 °C for 3 min 

3. Place on ice 

4. Cut small parafilm piece 

5 . Pipet 3 pi dots loading dye onto parafilm 

6. Pipet 1 pi each DNA sample onto loading dye 

7. Pipet 1 pi Hindlll DNA onto loading dye dot 

8 . Record DNA positions in lab notebook 

9. Load gel 

10. Store remaining DNA at -20 °C.

Fragment size (bp)    DNA (ng per 1 µg of ladder)
23,130 *              477
9,416                 194
6,557                 135
4,361 *               90
2,322                 48
2,027                 42
564                   11

Fig. 1.2 Determine the DNA concentration by comparing with the band intensity

11. Plug the gel box into the power supply.

12. Set the voltage to ~80 V.

13. Run the gel until the loading dye has migrated approximately
three-quarters of the gel length.

14. Visualize and photograph the gel in the gel doc room.

15. Dispose of the gel.

16. By comparing the brightness of the bands, you can estimate
your DNA concentration. For instance, if the brightness of your band is
about halfway between that of the 4,361 bp and 2,322 bp fragments, you
could estimate that it contains ~70 ng of DNA (for 1 µg of ladder loaded).

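17. Alternatively, you can determine the concentration by calculation.
Each ladder fragment carries a share of the loaded ladder mass that is
proportional to its length (the complete lambda ladder is 48,502 bp).
For example, if your band’s brightness seems similar to that of the
4,361 bp band (Fig. 1.2) and 1 µg (1,000 ng) of ladder was loaded,
perform the following calculation:

\[
\frac{4{,}361\ \text{bp}}{48{,}502\ \text{bp}} \times 1{,}000\ \text{ng} \approx 90\ \text{ng}
\]

Dividing this mass by the volume of sample loaded (1 µl in step 6)
gives the concentration X in ng/µl.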
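The band-mass arithmetic of steps 16 and 17 can be sketched in a few lines of Python. This is only an illustration and not part of the original protocol: the function names are hypothetical, and the sketch assumes that 1 µg of lambda HindIII ladder was loaded and that the mass of a fragment is proportional to its length. The fragment sizes and the 48,502 bp total are those of Fig. 1.2.

```python
# Mass carried by each lambda HindIII ladder band, assuming fragment mass is
# proportional to fragment length (helper names are illustrative only).

LAMBDA_GENOME_BP = 48_502
LADDER_FRAGMENTS_BP = [23_130, 9_416, 6_557, 4_361, 2_322, 2_027, 564]


def fragment_mass_ng(fragment_bp, ladder_loaded_ng=1_000.0):
    """Return the ng of DNA in one ladder band for a given ladder load."""
    return fragment_bp / LAMBDA_GENOME_BP * ladder_loaded_ng


def sample_concentration_ng_per_ul(matched_fragment_bp, sample_volume_ul=1.0,
                                   ladder_loaded_ng=1_000.0):
    """Estimate sample concentration from the ladder band of similar brightness."""
    mass_ng = fragment_mass_ng(matched_fragment_bp, ladder_loaded_ng)
    return mass_ng / sample_volume_ul


if __name__ == "__main__":
    # Print the ng-per-band table for a 1 µg ladder load.
    for bp in LADDER_FRAGMENTS_BP:
        print(f"{bp:>6} bp  {fragment_mass_ng(bp):6.1f} ng")
```

If a different amount of ladder is loaded, pass it as `ladder_loaded_ng`; the per-band masses scale linearly.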



1.3.12. DNA Quantification and Estimation by Spectrophotometer



1. Use H2O or 1x TE as a solvent to suspend the nucleic acids
and place each sample in a quartz cuvette.

2. Zero the spectrophotometer with a sample of solvent. 

3. For more accurate readings of the nucleic acid sample of
interest, dilute the sample to give readings between 0.1
and 1.0.

4. For a 1-cm path length, the optical density at 260 nm
(OD260) equals 1.0 for the following solutions:

(a) 50 µg/ml solution of dsDNA

(b) 33 µg/ml solution of ssDNA

(c) 20-30 µg/ml solution of oligonucleotide

(d) 40 µg/ml solution of RNA

5. Contamination of nucleic acid solutions makes spectrophoto- 
metric quantitation inaccurate. 

6. Calculate the OD260/OD280 ratio for an indication of nucleic 
acid purity. 

7. Pure DNA has an OD260/OD280 ratio of ~1.8; pure RNA has
an OD260/OD280 ratio of ~2.0.

8. Low ratios could be caused by protein or phenol conta- 
mination. 
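The conversion factors of step 4 and the purity ratio of steps 6-8 can be combined in a short Python sketch. The function and dictionary names are hypothetical and chosen only for illustration; the oligonucleotide factor is left out because the text gives it as a range (20-30 µg/ml) rather than a single value.

```python
# An OD260 of 1.0 (1-cm path length) corresponds to these concentrations (µg/ml).
UG_PER_ML_PER_OD260 = {"dsDNA": 50.0, "ssDNA": 33.0, "RNA": 40.0}


def concentration_ug_per_ml(od260, nucleic_acid="dsDNA", dilution_factor=1.0):
    """Concentration of the undiluted sample from an OD260 reading."""
    return UG_PER_ML_PER_OD260[nucleic_acid] * od260 * dilution_factor


def purity_ratio(od260, od280):
    """OD260/OD280 ratio; ~1.8 for pure DNA and ~2.0 for pure RNA."""
    return od260 / od280


def reading_in_range(od260, low=0.1, high=1.0):
    """True when the reading falls in the 0.1-1.0 window suggested in step 3."""
    return low <= od260 <= high
```

For example, `concentration_ug_per_ml(0.5, "dsDNA", dilution_factor=20)` returns 500.0 µg/ml, and a ratio from `purity_ratio` well below 1.8 for a DNA sample would point to protein or phenol contamination.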

1.3.13. Calculation

A sample of dsDNA was diluted 50x. The diluted sample gave a
reading of 0.65 on a spectrophotometer at OD260. To determine
the concentration of DNA in the original sample, perform the
following calculation:

1. dsDNA concentration = 50 µg/ml x OD260 x dilution
factor

2. dsDNA concentration = 50 µg/ml x 0.65 x 50

3. dsDNA concentration = 1,625 µg/ml ≈ 1.63 mg/ml

Chapter 2



Fluorescent-Based Detection, Quantitation, and Expression 
of Viral Gene by qRT-PCR 

Shelly Praveen and Vikas Koundal 

Abstract 

Using fluorescent reporter molecules, viral gene(s) can be quantified by quantitative PCR for diagnostics as well as for gene
expression studies. The process is based on the detection of the fluorescence
produced by a reporter molecule, which increases as the reaction proceeds because
PCR product accumulates with each cycle of amplification. The procedure follows the general
principle of the polymerase chain reaction; its key feature is that the amplified DNA is quantified as it
accumulates in the reaction in real time after each amplification cycle. Here we describe in detail
the various fluorescent molecules and the strategies used to determine the viral load. These experiments
are equally suited to studying viral gene expression.



2.1 Introduction 



Over the last several years, the development of novel chemistries and
instrumentation platforms enabling detection of PCR products
in real time has led to the widespread adoption of quantitative
PCR (Q-PCR/qPCR) as the method of choice for diagnostics and for
analyzing changes in gene expression [1]. It is called “real-time
PCR” because it allows us to monitor the increase in the amount
of DNA as it is amplified. Real-time polymerase chain reaction,
also known as kinetic polymerase chain reaction, is a more
sensitive technique for amplifying and simultaneously quantifying
a targeted DNA molecule than the commonly used
Northern and Southern blotting techniques. It enables both
detection and quantification (as absolute number of copies or
relative amount when normalized to DNA input or additional
normalizing genes) of a specific sequence in a DNA sample.
Q-PCR can also be used to quantify RNA levels from much smaller
samples than these blotting techniques require. In fact, the technique
is sensitive enough to enable quantitation of RNA from a single cell.

Real-time PCR or Q-PCR is a variation of the standard PCR
technique used to quantify DNA or messenger RNA (mRNA) in a
sample. The amplified product is quantified using
fluorescent probes and specialized instruments that measure fluorescence
while performing the temperature changes needed for the PCR
cycles. Real-time PCR is based on the detection of the fluorescence
produced by a reporter molecule, which increases as the
reaction proceeds because PCR product accumulates with each
cycle of amplification. These fluorescent
reporter molecules include dyes that bind to double-stranded
DNA (e.g., SYBR Green) and sequence-specific probes. The procedure
follows the general principle of the polymerase chain reaction; its
key feature is that the amplified DNA is quantified as it accumulates
in the reaction in real time after each amplification cycle.
Using sequence-specific primers, the relative number of copies
of a particular DNA or RNA sequence can be determined.
We use the term relative since this technique tends to be used to
compare copy numbers between tissues, organisms, or
different genes relative to a specific housekeeping gene (or reference
gene). Housekeeping genes are constitutively expressed at a
relatively constant level in cells under all
known conditions. To obtain the value of the target gene under
investigation and the value of the housekeeping gene in the same
sample, a standard curve can be used. The absolute concentration
of the target gene is then divided by the absolute concentration
of the housekeeping gene, giving a target/reference
ratio that expresses the amount of target gene normalized
to the level of the reference gene within each unknown sample [2].
Quantification relies on measuring the amount of amplified
product at each cycle of the PCR. DNA/RNA from
genes with higher copy numbers will be detected after fewer
PCR cycles of melting, annealing, and extension.

We present here how to quantitate the 2b gene (RNAi suppressor)
of Cucumber mosaic virus (CMV) in different plant samples, using
fluorescent molecules [3]. CMV is the type member of the genus
Cucumovirus in the family Bromoviridae. Cucumber mosaic, first
described in 1916, was one of the earliest plant diseases attributed
to a virus. The CMV genome consists of positive-sense, single-stranded
RNA. CMV encodes five proteins, distributed on three genomic
RNAs. RNA1 is the only monocistronic RNA; it encodes
the 1a protein, which is required for viral replication and contains
methyltransferase and helicase motifs. RNA2 encodes the 2a protein,
the viral polymerase, and the 2b protein, the RNAi suppressor.
RNA3 encodes the movement protein (MP) and the coat protein
(CP), which is expressed from the subgenomic RNA4. The satellite RNAs
(satRNAs) of CMV are small linear RNAs that do not carry any apparent
coding capacity (Fig. 2.1).

Fig. 2.1 CMV genome consists of single-stranded positive sense RNA.

CMV infects over 1,000 species of hosts, including members 
of 85 plant families, making it the broadest host range virus 
known. Tomatoes infected with the Cucumber mosaic virus 
develop a slight yellowing and mottling of the older leaves. The 
expanding leaves typically become twisted, curl downward, and 
develop a “shoestring” appearance as a result of a restriction of the 
leaf surface to a narrow band around the midrib of the leaf. 
Diseased plants are stunted and produce poor fruit yield. 



2.2 Materials 



2.2.1. Instrument and Setup

1. Real-time PCR thermal cycler: LightCycler® 480 II (Roche).

2. Clear LightCycler® 480 multiwell plates.

3. LightCycler® 480 Sealing Foil.

4. 2x LightCycler® 480 SYBR Green I master mix/TaqMan
probe.

5. Gene-specific primers.



2.2.2. qRT-PCR Reaction Requirements (Using TaqMan Probe)

Total RNA (as CMV is an RNA virus) from the infected plant leaf samples
is used as the template.

Component                                  Volume
1. RNA (10-100 ng)                         30.0 µl
2. 10x TaqMan buffer                       5.0 µl
3. MgCl2 (25 mM)                           5.0 µl
4. dNTPs (10 mM)                           2.0 µl
5. Primer F (10 µM)                        2.5 µl
6. Primer R (10 µM)                        2.5 µl
7. TaqMan probe (10 µM)                    1.0 µl
8. Taq polymerase (5 U)                    0.5 µl
9. M-MuLV reverse transcriptase (20 U)     0.5 µl
10. RNase inhibitor (20 U)                 1.0 µl
Total                                      50.0 µl

2.2.3. Cycling Parameters

1. Reverse transcription (using M-MuLV) at 48 °C for 30 min.

2. Taq activation at 95 °C for 10 min.

PCR profile (40 cycles):

1. Denaturation at 95 °C for 15 s.

2. Annealing/extension at 60 °C for 1 min.

2.2.4. Components

2.2.4.1. Primers and Probe

Whenever possible, primers and probes should be selected in a
region with a G/C content of 30-80 %. Regions with a G/C
content above this range may not denature well during thermal cycling,
leading to a less efficient reaction. In addition, G/C-rich
sequences are susceptible to nonspecific interactions that may
reduce reaction efficiency and produce nonspecific signal in
SYBR Green assays. For this same reason, primer and probe
sequences containing runs of four or more G bases should
be avoided. A/T-rich sequences require longer primer and
probe sequences in order to obtain the optimum melting temperatures.
This is rarely a problem for quantitative assays; however,
probes approaching 40 bases can exhibit less efficient quenching
and produce lower synthesis yields. Primers should be highly
purified, ideally by HPLC, and their
concentration should be in the range of 0.3-1 µM, ideally 0.5 µM.
The last five bases on the 3' end of the primers should contain no
more than two C and/or G bases, which
reduces the possibility of nonspecific product formation. Under
certain circumstances, however, such as a G/C-rich template
sequence, this recommendation may have to be relaxed to keep
the amplicon under 150 base pairs in length. It should be followed
as often as possible, and even when it is not possible, primer
3' ends extremely rich in G and/or C bases should be avoided.
The Tm of the primers is adjusted to the range of 58-60 °C, as both
annealing and extension are carried out in a single step in real-time
PCR.

2.2.4.2. Probe Selection Criteria

1. Select the probe first and design the primers as close as possible
to the probe without overlapping it.

2. Keep the G/C content in the 30-80 % range.

3. Avoid runs of an identical nucleotide, especially guanine,
where runs of four or more should be avoided, and there
should be no G on the 5' end.

4. The Tm of the probe should be in the range of 60-70 °C.

5. Select the probe with more C than G bases.
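The sequence-based rules above (G/C content of 30-80 %, no runs of four or more G, at most two G/C in the last five 3' bases of a primer, no 5' G and more C than G in a probe) lend themselves to a simple automated check. The following Python sketch is illustrative only: the function names are hypothetical, and melting temperatures are deliberately not calculated because the text does not specify a Tm formula.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def longest_run(seq, base):
    """Length of the longest uninterrupted run of `base` in `seq`."""
    longest = current = 0
    for b in seq.upper():
        current = current + 1 if b == base else 0
        longest = max(longest, current)
    return longest


def primer_issues(primer):
    """Return a list of guideline violations for a primer (empty if none)."""
    issues = []
    if not 30.0 <= gc_percent(primer) <= 80.0:
        issues.append("G/C content outside 30-80 %")
    if longest_run(primer, "G") >= 4:
        issues.append("run of four or more G bases")
    if sum(b in "GC" for b in primer.upper()[-5:]) > 2:
        issues.append("more than two G/C in the last five 3' bases")
    return issues


def probe_issues(probe):
    """Return a list of guideline violations for a TaqMan probe."""
    probe = probe.upper()
    issues = []
    if not 30.0 <= gc_percent(probe) <= 80.0:
        issues.append("G/C content outside 30-80 %")
    if longest_run(probe, "G") >= 4:
        issues.append("run of four or more G bases")
    if probe.startswith("G"):
        issues.append("G at the 5' end")
    if probe.count("C") <= probe.count("G"):
        issues.append("not more C than G bases")
    return issues
```

With made-up sequences, `primer_issues("ATGCTAGCTAGGATCCATTA")` returns an empty list, whereas `probe_issues("GATCGGACTAGGTA")` reports the 5' G and the excess of G over C. Tm values obtained from a design program still have to be compared with the 58-60 °C (primer) and 60-70 °C (probe) windows separately.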

Selecting primers and probes with the recommended Tm is
one of the factors that allow the use of universal thermal cycling
parameters. Having the probe Tm 8-10 °C higher than that of the
primers ensures that the probe is fully hybridized during primer
extension. The required parameters for well-designed primers and
probes have been well documented. These parameters include a Tm
for the probe that is 10 °C higher than that of the primers, a primer Tm
between 58 and 60 °C, an amplicon size between 50 and 150 bases,
and the absence of 5' Gs.

2.2.4.3. Template

A critical aspect of performing real-time PCR is to begin with a
template of high purity. The amount of DNA should be about 5-30 ng;
ideally, 25 ng of DNA template is used in a 20 µl
reaction mix. The size of the amplicon should be <500 bp. Small
amplicons are favored because they promote high-efficiency assays. In
addition, high-efficiency assays enable relative quantification to be
performed using the comparative threshold cycle (Ct) method.
This method increases sample throughput by eliminating the need
for standard curves when looking at expression levels of a target
relative to a reference control.

2.2.5. General Recommendations for Real-Time RT-PCR

The optimal concentrations of the real-time PCR reagents are as
follows:

1. Magnesium chloride concentration should be between 4 and
7 mM.

2. The concentration of each dNTP in a TaqMan reaction should be
200 µM.

Typically 1.25 U of Taq DNA polymerase is used in a 50-µl
reaction mixture. This is the minimum requirement; if necessary,
the amount can be optimized by increasing it in 0.25 U
increments.

2.3 Methods

2.3.1. Standard Curve Method

In this method, a standard curve is first plotted from a DNA/RNA
sample of known concentration. This curve is then used as a
reference standard for extrapolating quantitative information for
samples of unknown DNA/RNA concentration. Nucleic acids
such as DNA, RNA, in vitro generated ssDNA, or any cDNA sample
can be used to construct the standard curve. For the standard curve,
the standard sample is first quantified accurately by spectrophotometry
and is then converted to copy number based on the molecular
weight of the sample used (Fig. 2.2).

Fig. 2.2 Standard curve for absolute quantitation (axis label: threshold cycles).

Though RNA standards can be used, their stability can be a
source of variability in the final analyses. Using RNA standards also
involves the construction of cDNA plasmids that have to be
transcribed in vitro into the RNA standards. To correct for the variation
introduced by variable RNA inputs, normalization can be
done using a housekeeping gene.

2.3.2. Comparative Threshold (Ct) Method

This involves comparing the Ct values of the sample of interest with those of a
control or calibrator, such as a non-treated sample or RNA from
normal tissue. The Ct values of both the sample of interest and the
calibrator are normalized to an appropriate endogenous housekeeping
gene.

The comparative Ct method is also known as the 2^(-ΔΔCt)
method, where:

\[
\Delta\Delta C_t = \Delta C_{t,\text{sample}} - \Delta C_{t,\text{reference}}
\]

Here, ΔCt,sample is the Ct value for any sample normalized
to the endogenous housekeeping gene, and ΔCt,reference
is the Ct value for the calibrator, also normalized to the
endogenous housekeeping gene.

For the ΔΔCt calculation to be valid, the amplification
efficiencies of the target and the endogenous reference
must be approximately equal. This can be assessed by looking
at how ΔCt varies with template dilution. If the slope of the plot of
cDNA dilution versus ΔCt is close to zero, the
efficiencies of the target and housekeeping genes are very similar.
If a housekeeping gene cannot be found whose amplification
efficiency is similar to that of the target, then the standard curve method
is preferred (Fig. 2.3).

Fig. 2.3 Graph representing a typical curve with critical threshold value (axis label: cycles).
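The two quantification strategies of Sects. 2.3.1 and 2.3.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' analysis software: all function names are hypothetical, the average mass of 650 g/mol per base pair of dsDNA is a commonly used approximation that is not given in the text, and the efficiency relation E = 10^(-1/slope) - 1 is the usual convention for a plot of Ct against log10 copy number.

```python
import math

AVOGADRO = 6.022e23                # molecules per mole
DSDNA_G_PER_MOL_PER_BP = 650.0     # assumed average mass of one base pair


def copies_from_mass(mass_ng, length_bp):
    """Convert a mass of dsDNA standard (ng) to a copy number."""
    return mass_ng * 1e-9 * AVOGADRO / (length_bp * DSDNA_G_PER_MOL_PER_BP)


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(slope):
    """Efficiency from the slope; 1.0 means perfect doubling per cycle."""
    return 10.0 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Read the copy number of an unknown sample off the standard curve."""
    return 10.0 ** ((ct - intercept) / slope)


def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCt) method."""
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2.0 ** -(d_ct_sample - d_ct_calibrator)
```

As a usage example with invented Ct values, a sample with target Ct 22 and housekeeping Ct 18, compared with a calibrator with target Ct 25 and housekeeping Ct 18, gives `fold_change_ddct(22, 18, 25, 18) == 8.0`. Comparing `amplification_efficiency` for the target and the housekeeping gene is one way to judge whether the ΔΔCt calculation is justified.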



2.3.3. Reaction Setup and Thermal Cycling

1. Prepare the reaction mix (according to the components given
in Sect. 2.2.2) and load it on the multiwell plate.

2. Set up the cycling parameters (shown in Sect. 2.2.3 for the
CMV 2b gene).

3. Run the program and analyze the results.
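When several wells are set up, the shared components of Sect. 2.2.2 are usually combined into a master mix. The Python sketch below scales the per-reaction volumes and checks final concentrations with the dilution rule C1 x V1 = C2 x V2. It is an illustration only: the names are hypothetical, and the 10 % pipetting overage is an assumption that does not come from the protocol.

```python
# Per-reaction volumes (µl) from Sect. 2.2.2; template RNA is added per well.
PER_REACTION_UL = {
    "10x TaqMan buffer": 5.0,
    "MgCl2 (25 mM)": 5.0,
    "dNTPs (10 mM)": 2.0,
    "Primer F (10 µM)": 2.5,
    "Primer R (10 µM)": 2.5,
    "TaqMan probe (10 µM)": 1.0,
    "Taq polymerase": 0.5,
    "M-MuLV reverse transcriptase": 0.5,
    "RNase inhibitor": 1.0,
}
TEMPLATE_UL = 30.0
TOTAL_UL = sum(PER_REACTION_UL.values()) + TEMPLATE_UL


def master_mix(n_reactions, overage=0.10):
    """Volumes (µl) of each shared component for n reactions plus overage."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution rule C1*V1 = C2*V2; the result has the unit of stock_conc."""
    return stock_conc * stock_volume_ul / total_volume_ul
```

For instance, `final_concentration(10.0, 2.5, TOTAL_UL)` gives 0.5, i.e., each primer ends up at 0.5 µM in the 50 µl reaction, which is the concentration recommended in Sect. 2.2.4.1.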



2.4 Notes 



Quantification is done by measuring the amount of amplified
product at each cycle of the PCR. DNA/RNA from
genes with higher copy numbers will be detected after fewer PCR
cycles. The amplified product is quantified using fluorescent
probes and specialized instruments that measure fluorescence
while performing the temperature changes required for the PCR
cycles. Real-time PCR is based on the detection of the fluorescence
produced by a reporter molecule, which increases as the
reaction proceeds because PCR product accumulates with each
cycle of amplification. These fluorescent
reporter molecules include dyes that bind to double-stranded
DNA (e.g., SYBR Green) and sequence-specific probes (TaqMan
probes). The procedure follows the general principle of the polymerase
chain reaction; its key feature is that the amplified DNA is
quantified as it accumulates in the reaction in real time after each
amplification cycle.

2.4.1. SYBR Green

SYBR Green provides the simplest and most economical format
for detection and quantitation of PCR products in real-time reactions.
SYBR Green binds double-stranded DNA and, upon excitation,
emits light. An increase in DNA product during PCR therefore leads
to an increase in fluorescence intensity, which is measured at each
cycle, thus allowing DNA concentrations to be quantified
(Fig. 2.4).

Fig. 2.4 SYBR Green; a fluorescent dye that binds to double-stranded DNA.

The advantages of SYBR Green are that it is inexpensive,
easy to use, and sensitive. However, dsDNA dyes such as SYBR
Green bind to all dsDNA PCR products, including nonspecific
PCR products (such as “primer dimers”). This can potentially
interfere with or prevent accurate quantification of the intended
target sequence. For single-product PCR reactions with well-designed
primers, SYBR Green can work extremely well, with
spurious nonspecific background showing up only in very late cycles.

1. The reaction is prepared as usual, with the addition of a fluorescent
dsDNA dye (instead of the TaqMan probe described in the
protocol).

2. The reaction is run in a thermocycler, and after each cycle the
level of fluorescence is measured with a detector; the dye
only fluoresces when bound to dsDNA (i.e., the PCR
product). With reference to a standard dilution, the dsDNA
concentration in the PCR can be determined.

Like other real-time PCR methods, the values obtained
do not have absolute units associated with them (i.e., DNA/RNA
copies/cell). A comparison of a measured DNA/RNA sample to a
standard dilution will only give a fraction or ratio of the sample
relative to the standard, allowing only relative comparisons between
different tissues or experimental conditions. To ensure accuracy in
the quantification, it is usually necessary to normalize expression of a
target gene to a stably expressed housekeeping gene, e.g., ubiquitin.
SYBR Green is the most widely used double-strand DNA-specific
dye reported for real-time PCR. SYBR Green binds to the
minor groove of the DNA double helix. In solution, the
unbound dye exhibits very little fluorescence. This fluorescence is
substantially enhanced when the dye is bound to double-stranded
DNA. SYBR Green remains stable under PCR conditions, and the
optical filter of the thermocycler can be set to match its
excitation and emission wavelengths. Ethidium bromide can also be
used in real-time PCR for detection of amplification, but its carcinogenic
nature restricts its use.

2.4.2. Hydrolysis Probe

The hydrolysis probe chemistry, e.g., TaqMan probes, relies on the 5'-3' exonuclease
activity of Taq polymerase, which degrades a hybridized non-extendible
DNA probe [4] during the extension step of the
PCR. The TaqMan probe is designed to hybridize
to a region within the amplicon and is dual labeled with a reporter
dye and a quenching dye. The reporter dye is attached to the 5'
end of the probe and the quencher to the 3' end. The close
proximity of the reporter to the quencher prevents detection of its
fluorescence (Fig. 2.5).

Fig. 2.5 The TaqMan probe. The red circle represents the quenching dye that disrupts
the observable signal from the reporter dye (green circle) when it is within a short
distance.

During the annealing stage of the PCR, both probe and
primers anneal to the DNA target. Polymerization of a new
DNA strand is initiated by Taq polymerase from the primers,
and once the polymerase reaches the probe, its 5'-3' exonuclease
activity degrades the probe, physically separating the fluorescent
reporter from the quencher. The resulting
fluorescence is detected and measured in the real-time PCR thermocycler.
The more cycles of denaturing and annealing take
place, the more opportunities there are for the TaqMan probe to
bind and, in turn, the more emitted light is detected. The
geometric increase in fluorescence, corresponding to the exponential increase of the
product, is used to determine the threshold cycle (Ct) in each reaction.

Fluorescent reporter probes are the most accurate and reliable of
the methods. They use a sequence-specific RNA- or DNA-based
probe to quantify only the DNA containing the probe sequence;
therefore, use of a reporter probe significantly increases
specificity and allows quantification even in the presence of
some nonspecific DNA amplification. This potentially allows for
multiplexing, i.e., assaying for several genes in the same reaction by
using specific probes with differently colored labels, provided
that all genes are amplified with similar efficiency. Well-designed
TaqMan probes require very little optimization.

2.4.3. Hybridization Probe

Hybridization probes are used for DNA detection and quantitation,
providing maximum specificity for product identification.
Two specifically designed, sequence-specific oligonucleotide
probes, labeled with different dyes, are used. The sequences of
the probes are selected so that they hybridize to the target
sequences on the amplified DNA fragment in a head-to-tail orientation,
thus bringing the two dyes into close proximity. The donor
dye (fluorescein) is excited by the blue light source and emits
green fluorescent light at a slightly longer wavelength. At close
proximity, the energy emitted excites the acceptor dye attached to
the second hybridization probe, which then emits fluorescent
light at a different wavelength. The amount of fluorescence emitted
is directly proportional to the amount of target DNA generated
during the PCR.

2.4.3.1. FRET Probes

Fluorescence resonance energy transfer (FRET) is the transfer of energy
from an initially excited donor (D) to an
acceptor (A). The hybridization probe system consists of two
oligonucleotides labeled with different fluorescent dyes
and designed to hybridize in close proximity on
the target DNA. Interaction of the two dyes can only occur when
both are bound to their target. The donor probe is labeled with a
fluorophore at the 3' end and the acceptor probe at the 5' end. During
PCR, the two oligonucleotides hybridize to adjacent
regions of the target DNA such that the fluorophores, which are
coupled to the oligonucleotides, are in close proximity in the hybrid
structure. The donor fluorophore is excited by an external light
source and then passes part of its excitation energy to the adjacent
acceptor fluorophore. The excited acceptor fluorophore emits light
at a longer wavelength, which can then be detected and measured
(Fig. 2.6). The light source cannot excite the acceptor dye directly.

Fig. 2.6 Fluorescence resonance energy transfer (FRET).

Applications of FRET probes are in:

1. Quantitative PCR.

2. DNA copy number measurements.

3. Pathogen detection assays.

4. Single nucleotide polymorphism (SNP) genotyping.

5. Verification of microarray results.

2.4.3.2. Molecular Beacons

Molecular beacons are oligonucleotides that are hairpin shaped in their
unhybridized state, with a fluorophore on one end and a
quenching dye on the opposite end. When the
probe is not hybridized to its complementary target, the fluorophore
and quenching dye remain proximal to one another, so that
the quencher suppresses the fluorescence.
When the probe encounters a target molecule, it forms
a probe-target hybrid that is longer and more stable than the stem
hybrid; this causes the fluorophore and the quencher to move
away from each other, and fluorescence is emitted. Molecular
beacons are designed so that their probe sequence is just long
enough for a perfectly complementary probe-target hybrid to be
more stable than the stem hybrid. The length of the probe
sequence (10-40 nt) is chosen in such a way that the probe-target
hybrid is stable under the conditions of the assay. The stem sequence
(5-7 nt) is chosen to ensure that the two arms hybridize to each
other but not to the loop sequence (Fig. 2.7).

Fig. 2.7 Molecular beacon; a hairpin fluorescent probe.

A computer program is used to predict the melting temperature
of the stem and whether the intended stem-and-loop
conformation will form. Molecular beacons can
be synthesized with differently colored fluorophores,
enabling assays that simultaneously detect different
targets in the same reaction.

Molecular beacons are thus ideal probes for use in diagnostic
assays designed for genetic screening, SNP detection, and pharmacogenetic
applications. In summary, molecular beacons have
three key properties that enable the design of new and powerful
diagnostic assays:

1. They only fluoresce when bound to their targets.

2. They can be labeled with a fluorophore of any desired color.

3. They are so specific that they easily discriminate single-nucleotide
polymorphisms.

2.4.3.3. Scorpions

In Scorpion probes, sequence-specific priming and PCR product
detection are achieved using a single oligonucleotide. The Scorpion
probe maintains a stem-loop configuration in the unhybridized
state. The fluorophore is attached to the 5' end and is
quenched by a moiety coupled to the 3' end. The 3' portion of
the stem also contains sequence that is complementary to the
extension product of the primer. This sequence is linked to the
5' end of a specific primer via a non-amplifiable monomer. After
extension of the Scorpion primer, the specific probe sequence is
able to bind to its complement within the extended amplicon, thus
opening up the hairpin loop. This prevents the fluorescence from
being quenched, and a signal is observed (Fig. 2.8).

Fig. 2.8 Scorpion probe; a single oligonucleotide used in priming as well as in probing.

It is possible to choose between the open and closed Scorpion
formats. In the closed format, the probe part of the Scorpion
is designed with a stem at each end, complementary
to each other, so that it adopts a beacon-like secondary
structure when it is not yet hybridized to the primer’s extension
product. In this way, a fluorophore and a quencher attached to the
5' and 3' ends of the probe are in close proximity to each other.
Hence, when the Scorpion is free in solution, no fluorescence can
be detected. When the Scorpion unfolds as the probe binds to
the extended primer, the fluorophore and quencher are separated and
fluorescence can be detected and used to quantify the amount of PCR
product. In the open format, the probe part of the Scorpion does
not have a specific secondary structure in the unhybridized form
and carries a fluorophore. A separate quencher oligonucleotide is
designed alongside it. This quencher binds to the probe
part of the Scorpion when the Scorpion is not bound to its
intended target, so as to prevent fluorescence. When the probe
hybridizes to the extension product of the primer, the quencher
and probe are separated from each other and, hence, fluorescence
can be detected and used to quantify the amount of PCR product.

Abstract

The ability to cleave DNA at specific sites is one of the cornerstones of today’s methods of DNA
manipulation. Restriction endonucleases cleave duplex DNA at specific target sequences,
producing defined fragments. Restriction fragment length polymorphisms (RFLPs) are
differences in genomic DNA sequences between individuals that are revealed by cleaving each individual’s
DNA with restriction enzymes and separating the DNA fragments according to size. Each enzyme cuts the
palindrome at a particular site, and two different enzymes may have the same recognition sequence but
cleave the DNA at different points within that sequence. The cleavage sites fall into three different
categories: flush (or blunt), in which the recognition site is cut in the middle, or with either 5'- or
3'-overhangs, in which case unpaired bases are produced on both ends of the fragment.



3.1 Introduction 



Restriction endonucleases are a class of enzymes that cut DNA
molecules. Each enzyme recognizes a specific sequence of nucleotides
in the DNA strand, usually about 4-6 base pairs long. The
sequences are palindromic in that the complementary DNA strand
has the same sequence read in the reverse direction, so both
strands of DNA are cut at the same location [1].
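The palindromic property described above is easy to verify computationally: a site is palindromic when it equals its own reverse complement. The short Python sketch below is illustrative only and its function names are hypothetical. The recognition sequences used in the usage note (GAATTC for EcoRI, AAGCTT for HindIII) are the well-known sites of these enzymes and are not given in the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(site):
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(site.upper()))


def is_palindromic(site):
    """True when both strands read the same in the 5'-to-3' direction."""
    return site.upper() == reverse_complement(site)


def find_sites(dna, site):
    """0-based start positions of every occurrence of `site` in `dna`."""
    dna, site = dna.upper(), site.upper()
    return [i for i in range(len(dna) - len(site) + 1)
            if dna[i:i + len(site)] == site]
```

For example, `is_palindromic("GAATTC")` and `is_palindromic("AAGCTT")` are both true, and `find_sites` on a sequence lists the positions at which such an enzyme would be expected to cut, from which fragment lengths can be derived.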

Restriction enzymes are found in many different strains of 
bacteria where their biological role is to participate in cell defense. 
These enzymes “restrict” foreign (e.g., viral) DNA that enters the 
cell, by destroying it. Among the first of these “restriction 
enzymes” to be purified were EcoRI and EcoRII from Escherichia 
coli, and HindII and HindIII from Haemophilus influenzae. 
These enzymes were found to cleave DNA at specific sites, 
generating discrete, gene-size fragments that could be rejoined. 
Researchers were quick to recognize that restriction enzymes 
provided them with a remarkable new tool for investigating 
gene organization, function, and expression. The host cell has 
a restriction-modification system that methylates its own DNA 
at sites specific for its respective restriction enzymes, thereby 
protecting it from cleavage. Over 800 known enzymes have 
been discovered that recognize over 100 different nucleotide 
sequences. 

There are three different types of restriction enzymes. Type I 
cuts DNA at random locations as far as 1,000 or more base pairs 
from the recognition site. Type III cuts at ~25 base pairs from the 
site. Types I and III require ATP and may be large enzymes with 
multiple subunits. Type II enzymes, which are predominantly 
used in biotechnology, cut DNA within the recognized sequence 
without the need for ATP and are smaller and simpler. Type II 
restriction enzymes are named according to the bacterial species 
from which they are isolated [2]. For example, the enzyme EcoRI 
was isolated from E. coli. 

Type II restriction enzymes can generate two different types 
of cuts depending on whether they cut both strands at the center 
of the recognition sequence or each strand closer to one end of the 
recognition sequence. The former cut will generate “blunt ends” 
with no nucleotide overhangs. The latter generates “sticky” or 
“cohesive” ends, because each resulting fragment of DNA has an 
overhang that complements the overhangs of the other fragments. Both are useful 
in molecular genetics for making 
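
The distinction between blunt and sticky ends follows directly from where the enzyme cuts within its palindromic site. The following is a minimal Python sketch of that relationship, not part of the original protocol. The function names `end_type` and `overhang` are hypothetical; the three enzymes and their cut positions (EcoRI G^AATTC, SmaI CCC^GGG, PstI CTGCA^G) are well-known examples chosen for illustration.

```python
# Illustrative sketch: classify the ends left by a Type II enzyme from its
# palindromic recognition site and the cut position on the top strand.
ENZYMES = {
    # name: (recognition site 5'->3', number of bases before the top-strand cut)
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "SmaI": ("CCCGGG", 3),   # CCC^GGG
    "PstI": ("CTGCAG", 5),   # CTGCA^G
}


def end_type(site: str, cut: int) -> str:
    """Return 'blunt', "5' overhang" or "3' overhang" for a palindromic site."""
    # Because the site is palindromic, the cut on the bottom strand falls at
    # the mirror-image position when expressed in top-strand coordinates.
    bottom_cut = len(site) - cut
    if cut == bottom_cut:
        return "blunt"
    return "5' overhang" if cut < bottom_cut else "3' overhang"


def overhang(site: str, cut: int) -> str:
    """Return the single-stranded bases left unpaired (empty string if blunt)."""
    lo, hi = sorted((cut, len(site) - cut))
    return site[lo:hi]


if __name__ == "__main__":
    for name, (site, cut) in ENZYMES.items():
        print(name, end_type(site, cut), overhang(site, cut) or "-")
```

Under these assumptions EcoRI is reported as leaving a 5' overhang of AATT, SmaI as blunt, and PstI as leaving a 3' overhang of TGCA.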

Restriction enzymes are exceedingly varied; they range in size 
from the diminutive PvuII (157 amino acids) to the giant CjeI 
(1,250 amino acids) and beyond. Among over 3,000 activities 
that have been purified and characterized, more than 250 different 
sequence specificities have been discovered [3]. The search for 
new specificities continues, both biochemically, by the analysis of 
cell extracts, and computationally, by the analysis of sequenced 
genomes. Although most activities encountered today turn out to 
be duplicates — isoschizomers — of existing specificities, restriction 
enzymes with new specificities are found with regularity. 

Restriction enzymes are used in biotechnology to cut DNA into 
smaller strands in order to study fragment length differences among 
individuals (Restriction Fragment Length Polymorphism, RFLP) 
or for gene cloning. RFLP techniques have been used to determine 
that individuals or groups of individuals have distinctive differences 
in gene sequences and restriction cleavage patterns in certain areas 
of the genome. Knowledge of these unique areas is the basis for 
DNA fingerprinting. Each of these methods depends on the use 
of agarose gel electrophoresis for separation of the DNA fragments. 

Most DNA in the genome is not involved in coding 
sequences; much of this DNA is thus not subject to strong 
selective pressure for maintaining identical sequences from one 
individual to the next. Within noncoding DNA regions, unrelated 
individuals exhibit approximately one base pair change per 200 bp. 
These functionally silent variations are inherited according to strict 
Mendelian genetics. Given the existence of several hundred restriction 
enzymes, many such changes can be detected via the appearance or 
disappearance of a restriction enzyme cleavage site associated with 
the altered base pair. For example, the DNA sequence GAATTC is uniquely 
recognized and cleaved by the restriction enzyme EcoRI; point 
mutations of the EcoRI sequence, such as CAATTC, AAATTC, 
TAATTC, GGATTC, GCATTC, and GTATTC, are not cleaved by 
EcoRI. Thus any of these point mutations within a particular EcoRI 
site would cause the site to disappear. In rarer instances RFLPs can 
also occur within coding DNA. 
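
The loss of a restriction site through a point mutation can be checked with a few lines of Python. This is only a sketch under simple assumptions: the helper names `has_ecori_site` and `single_base_variants` are hypothetical, and the enzyme is treated as cutting only the exact sequence GAATTC (star activity is ignored).

```python
# Illustrative sketch: every single-base change within GAATTC abolishes the
# EcoRI site, which is what makes such changes detectable as RFLPs.
ECORI_SITE = "GAATTC"


def has_ecori_site(seq: str) -> bool:
    """True if the canonical EcoRI recognition sequence occurs in seq."""
    return ECORI_SITE in seq.upper()


def single_base_variants(site: str):
    """Yield every sequence that differs from site at exactly one position."""
    for i, ref in enumerate(site):
        for alt in "ACGT":
            if alt != ref:
                yield site[:i] + alt + site[i + 1:]


if __name__ == "__main__":
    variants = list(single_base_variants(ECORI_SITE))
    uncut = [v for v in variants if not has_ecori_site(v)]
    print(len(variants), "variants,", len(uncut), "no longer cleaved")
```

A six-base site has 6 x 3 = 18 possible single-base variants, and none of them retains the recognition sequence.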

An alternative cause of RFLPs involves no DNA sequence 
changes within restriction enzyme cleavage sites, but rather the 
occurrence of repetitive DNA between two successive restriction 
sites. (Return to Topic 3 for a quick refresher course in repetitive 
DNA.) Remember that repetitive DNA regions are very prone to 
variations in the number of reiterated elements (e.g., 
...CACACA... could be repeated 30 times in a particular region of 
one chromosome and 40 times within the same region of the 
homologous chromosome; similarly, a 28-bp-long VNTR could 
occur five times in a particular region of one chromosome and 
nine times within the same region of the homologous chromo- 
some). When such variable length repetitive DNA sequences 
occur between two successive restriction enzyme cleavage sites, 
they cause the total bp number between the two sites to vary. This 
can be detected by hybridizing a unique probe to this region. 
Homozygosity for either of the RFLP fragment lengths would 
be indicated in Southern blots by a single DNA band equivalent to 
either the “long” or “short” fragment; heterozygosity would be 
indicated by a DNA band of each size. 
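
The arithmetic behind this kind of RFLP is simple enough to sketch in Python. The flank lengths (500 bp and 700 bp) are invented for illustration, and the function names are hypothetical; only the 28-bp VNTR unit and the five and nine repeats come from the example above.

```python
# Illustrative sketch: fragment length between two restriction sites that
# flank a variable number of tandem repeats, and the resulting band pattern.
def fragment_length(left_flank_bp, right_flank_bp, repeat_unit_bp, n_repeats):
    """Total bp between the two restriction sites for one allele."""
    return left_flank_bp + repeat_unit_bp * n_repeats + right_flank_bp


def southern_pattern(repeats_a, repeats_b, unit_bp=28, left=500, right=700):
    """Band sizes (largest first) expected for a diploid individual.

    The flank lengths are assumed values; a homozygote yields one band,
    a heterozygote yields two.
    """
    sizes = {fragment_length(left, right, unit_bp, n) for n in (repeats_a, repeats_b)}
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    print(southern_pattern(5, 9))  # heterozygote: "long" and "short" band
    print(southern_pattern(5, 5))  # homozygote: a single band
```

With these assumed flanks, the five-repeat allele gives a 1,340-bp fragment and the nine-repeat allele a 1,452-bp fragment.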

Restriction enzymes recognize specific nucleotide sequences 
within DNA molecules. However, the recognition specificity of 
restriction enzymes can be reduced in vitro. Under certain condi- 
tions, enzymes are able to recognize and cleave nucleotide 
sequences which differ from the canonical site. At low ionic 
strength, for example, BamHI (with the recognition sequence 
GGATCC) is able to cleave the following sequences: NGATCC, 
GPuATCC and GGNTCC. This phenomenon is called “relaxed” 
or “star” activity [4-6] (Fig. 3.1). 
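
To see which positions in a sequence would be at risk under relaxed conditions, the BamHI patterns quoted above can be turned into regular expressions. This is a sketch only: `classify_sites` is a hypothetical helper, Pu is written with the IUPAC code R (A or G), and real star activity depends on reaction conditions rather than on sequence alone.

```python
import re

# Subset of the IUPAC nucleotide codes needed for the patterns below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}

CANONICAL = "GGATCC"                      # BamHI recognition sequence
RELAXED = ["NGATCC", "GRATCC", "GGNTCC"]  # star sites quoted in the text


def to_regex(pattern: str):
    """Compile an IUPAC pattern; the lookahead allows overlapping matches."""
    return re.compile("(?=(" + "".join(IUPAC[b] for b in pattern) + "))")


def classify_sites(seq: str) -> dict:
    """Map each candidate site position to 'canonical' or 'star'."""
    seq = seq.upper()
    hits = {}
    for pattern in RELAXED:
        for match in to_regex(pattern).finditer(seq):
            hits[match.start()] = "star"
    # Canonical sites also satisfy the relaxed patterns, so mark them last.
    for match in to_regex(CANONICAL).finditer(seq):
        hits[match.start()] = "canonical"
    return hits


if __name__ == "__main__":
    demo = "TT" + "GGATCC" + "AA" + "AGATCC" + "TT" + "GGTTCC" + "AA"
    print(classify_sites(demo))
```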

In most practical applications of restriction enzymes, star 
activity is not desirable. Analysis of several reports on the star 
activity suggests the following causes for this phenomenon: 

1. Prolonged incubation 

2. High enzyme concentration in the reaction mixture 

3. High glycerol concentration in the reaction mixture 

4. Presence of organic solvents, such as ethanol or dimethyl 
sulfoxide, in the reaction mixture 

5. Low ionic strength of the reaction buffer 

6. Suboptimal pH values of the reaction buffer 

7. Substitution of Mg2+ with other divalent cations, such as Mn2+ 
or Co2+ [7] 

Fig. 3.1 Enzyme star activity. 1: Lambda DNA; 2: Lambda DNA incubated 1 h with 
0.15 U of EcoRI (incomplete cleavage); 3: Lambda DNA incubated 1 h with 0.4 U of 
EcoRI (incomplete cleavage); 4: Lambda DNA incubated 1 h with 1 U of EcoRI 
(complete digestion); 5: Lambda DNA incubated 16 h with 40 U of EcoRI (star activity); 
6: Lambda DNA incubated 16 h with 70 U of EcoRI (star activity) 

In some cases, the termini generated by DNA cleavage with 
a restriction enzyme at the canonical site have been shown to 
stimulate the enzyme’s star activity. Star activity and incomplete 
DNA digestion result in atypical electrophoresis patterns, which 
can be identified by careful examination of gel images (see 
Fig. 3.1). Here, incomplete DNA digestion results in additional 
low-intensity bands above the expected DNA bands on the gel. 
No additional bands below the smallest expected fragment are 
observed. These additional bands disappear when the incubation 
time or amount of enzyme is increased. On the contrary, star 
activity results in additional DNA bands below the expected 
bands and no additional bands above the largest expected frag- 
ment. These additional bands become more intense with the 
increase of either the incubation time or the amount of enzyme, 
while the intensity of the expected bands decreases. Some restric- 
tion enzymes may remain associated with the substrate DNA after 
cleavage and thus change the mobility of digestion products 
during electrophoresis. The resulting atypical pattern is not 
related to star activity (Fig. 3.1). To avoid confusing electropho- 
resis patterns, use a loading dye with SDS (e.g., the Fermentas 6x 
DNA Loading Dye & SDS Solution). Then, heat the sample for 
10 min at 65 °C and place it on ice prior to loading it on the gel. 
Any tendency of a restriction enzyme to exhibit star activity is 
indicated both in the product description and in the Certificate of 
Analysis supplied with each enzyme. 



3.2 Materials 



1. A 10x stock of the appropriate restriction enzyme buffer. 

2. DNA to be digested (see Notes 3 and 4) in either water or TE 
(10 mM Tris-HCl, pH 8.3, 1 mM EDTA). 

3. Bovine serum albumin (BSA) at a concentration of 1 mg/ml. 

4. Sterile distilled water. 

5. The correct enzyme for the digest. 

6. 5x loading buffer: 50 % (v/v) glycerol, 100 mM Na2EDTA, 
pH 8.0, 0.125 % (w/v) bromophenol blue (6 pb), 0.125 % 
(w/v) xylene cyanol. 

7. 100 mM Sperm DNA. 



3.3 Method 



1. Thaw all solutions, with the exception of the enzyme, and 
then place on ice. 

2. Decide on a final volume for the digest, usually between 10 
and 50 µl, and then into a sterile Eppendorf tube, add 1/10 
volume of reaction buffer, 1/10 volume of BSA, between 0.5 
and 1 µg of the DNA to be digested, and sterile distilled water 
to the final volume. 

3. Take the restriction enzyme stock directly from the -20 °C 
freezer, and remove the desired units of enzyme with a clean 
sterile pipette tip. Immediately add the enzyme to the reaction 
and mix. 

4. Incubate the tube at the correct temperature (as per the 
manufacturer instructions) for ~2-6 h. Genomic DNA can 
be digested overnight. 

5. An aliquot of the reaction (usually 10-12 µl) may be mixed with 
a 5x concentrated loading buffer and analyzed by gel 
electrophoresis. 
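
The volumes for the reaction set-up above can be worked out with a short Python sketch. It is an illustration rather than a prescribed recipe: the function name `digest_mix` and the default enzyme volume of 1 µl are assumptions (the method only states that the desired units of enzyme are added), and water is simply whatever is left to reach the final volume.

```python
# Illustrative sketch: pipetting volumes (in µl) for a restriction digest.
def digest_mix(final_volume_ul, dna_conc_ug_per_ul, dna_mass_ug=1.0, enzyme_ul=1.0):
    """Return the volume of each component for one digest.

    Buffer and BSA are each 1/10 of the final volume, as in the method.
    enzyme_ul is an assumed value; adjust it to the units required.
    """
    buffer_ul = final_volume_ul / 10
    bsa_ul = final_volume_ul / 10
    dna_ul = dna_mass_ug / dna_conc_ug_per_ul
    water_ul = final_volume_ul - (buffer_ul + bsa_ul + dna_ul + enzyme_ul)
    if water_ul < 0:
        raise ValueError("components exceed the final volume; use more concentrated DNA")
    return {
        "10x buffer": buffer_ul,
        "BSA": bsa_ul,
        "DNA": dna_ul,
        "enzyme": enzyme_ul,
        "water": water_ul,
    }


if __name__ == "__main__":
    for component, volume in digest_mix(20, 0.5).items():
        print(f"{component:>10}: {volume:.1f} µl")
```

For a 20 µl digest of 1 µg DNA supplied at 0.5 µg/µl, this gives 2 µl buffer, 2 µl BSA, 2 µl DNA, 1 µl enzyme, and 13 µl water.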
Table 3.1 

Single letter code 




3.4 Note 

A number of restriction enzymes discovered by Fermentas are 
isoschizomers of commonly used prototype restriction enzymes. 
Table 3.1 will help you find the appropriate Fermentas enzymes 
for your experiments. 
Chapter 4 



Genetic Fingerprinting Techniques for Molecular 
Characterisation of Microbes 

Annette Reineke and K. Uma Devi 

Abstract 

DNA fingerprints are commonly generated for a genetic characterisation of microbial populations or 
communities. The respective techniques are based either on hybridisation or on polymerase chain reaction 
(PCR). We present an overview and detailed protocols of the most frequently used DNA fingerprinting 
techniques in microbial ecology, including isolation of respective target sequences, set-ups 
of PCR reactions, and ways of detecting markers for generating fingerprints. 



4.1 Introduction 



Genetic or DNA fingerprinting (also called DNA profiling or 
DNA testing) is a term applied to a range of techniques that are 
used to show similarities and dissimilarities between samples of 
DNA from different individuals. Since the sequence of nucleotides 
in an individual’s DNA is as unique as its fingerprint, the term 
genetic fingerprinting was introduced in the mid-1980s initially in 
the context of assisting in correct identification of humans espe- 
cially in forensic investigations. However, DNA fingerprinting was 
quickly shown to work in all kinds of organisms from mammals to 
plants, invertebrates, and microorganisms as well. 

In microbial ecology, it is often important to have tools at hand 
for genetic characterisation of a community of interest. Such a 
characterisation may address all the microbes (the types and their 
numbers) in the community, or microbes of one taxonomic group 
(bacteria, fungi, algae, etc.), or microbes with a particular metabolic 
pathway, e.g. a nitrifying or denitrifying community. Community 
characterisation can help to answer questions such as whether the 
community has changed over time or whether the change in community 
is correlated with any environmental parameters. The difficulty in 
characterising a microbial community lies in the fact that only around 
0.1-10 % [1] of microbes are cultivable. Even retrieving the cultivable 
ones from a sample requires trying several different culture media. 
Therefore, molecular techniques are invoked to characterise the community 
either biochemically, based on the reactions they catalyse, or 
genetically, based on their DNA sequences using DNA fingerprinting 
techniques. Such techniques are also applied in cases where the 
organism of interest is cultivable and the aim of the experiment is 
to characterise the variability of that species in an ecological niche or 
to compare these organisms from different regions. The assumptions 
underlying genetic profiling of microbial communities are that the 
DNA of all the microbes in the sample is retrieved through the DNA 
extraction process, the DNA is from living organisms, and there is no 
bias in the amplification of the DNA sequence selected for any 
particular genus/species/strain. 

DNA fingerprinting techniques are basically of two kinds: 
hybridisation based and PCR (Polymerase Chain Reaction) based. Restriction 
Fragment Length Polymorphism (RFLP) is a hybridisation-based 
technique. Among the several PCR-based genetic profiling methods, 
RAPD (Random Amplified Polymorphic DNA), AFLP (Amplified 
Fragment Length Polymorphism), SSCP (Single Strand Conformation 
Polymorphism), and SSR/STR (Simple Sequence Repeat/Short Tandem 
Repeat) methods are used for studying the population structure of 
a particular species. The organisms used in these studies are in most 
cases cultivable. 

For analysis of variability in microbial communities, DNA is 
extracted not from cultured organisms but from a sample such as 
soil, water, abdominal content of an organism, etc. DNA isolation 
and purification kits are available for, e.g., soil or water samples. The 
most commonly used methods are ARDRA (Amplified Ribosomal 
DNA Restriction Analysis), T-RFLP (Terminal Restriction Fragment 
Length Polymorphism), and DGGE and TGGE (Denaturing/ 
Temperature Gradient Gel Electrophoresis). The T-RFLP method is 
useful for rapid analysis to assess community diversity, but ARDRA 
is more discriminative to differentiate phylotypes. 

Quite frequently, ribosomal RNA genes are used for genetic 
profiling of microbial communities. There are several reasons for 
rRNA genes to be the ones of choice: (a) they occur universally in 
all organisms, (b) they have long, highly conserved regions useful 
for looking for distant phylogenetic relationships, (c) they have 
sufficiently variable regions to assess close relationships, and (d) they are 
not prone to rapid sequence change due to selection and serve as 
an evolutionary chronometer. Due to the extensive use of rRNA genes 
in genetic profiling of communities, a database has been created 
including the patterns in different studies (http://rdp.cme.msu.edu). 
Therefore, if rRNA gene profiling is done, the results can be readily 
compared with data in the database, which facilitates 
identification of the microbes. In addition, any DNA/gene 
sequence suitable for the purpose of the experiment can also be 
targeted for profiling of microbial communities (Table 4.1). 



Table 4.1 

Examples of universal genes and primer sequences targeted in various genetic profiling methods 
The conditions of the PCR cycle would depend on the sequence 
to be amplified. The choice of the restriction enzymes is again a 
matter of trial and error. Frequent cutters (four bp recognition 
sites) are commonly used. When sequence information is 
available, enzymes with the highest number of recognition sites are 
identified using software such as NEBcutter 
(http://tools.neb.com/NEBcutter2). 
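
When only a quick comparison is needed, counting recognition sites can also be sketched directly in Python. This does not reproduce NEBcutter; it is a simplified stand-in. The helper names are hypothetical, the five enzymes are common four-base cutters with well-known palindromic sites, and the demonstration sequence is invented.

```python
# Illustrative sketch: rank frequent cutters by the number of recognition
# sites found in a known amplicon sequence.
FOUR_BASE_CUTTERS = {
    "AluI": "AGCT",
    "HaeIII": "GGCC",
    "MspI": "CCGG",
    "RsaI": "GTAC",
    "TaqI": "TCGA",
}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of site in seq, including overlapping ones."""
    seq = seq.upper()
    width = len(site)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == site)


def rank_enzymes(seq: str, enzymes=FOUR_BASE_CUTTERS):
    """Return (enzyme, site count) pairs, most sites first, ties by name."""
    counts = {name: count_sites(seq, site) for name, site in enzymes.items()}
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    amplicon = "AGCTTTGGCCAAAGCTCCGG"  # invented sequence for demonstration
    for name, n in rank_enzymes(amplicon):
        print(name, n)
```

Because all sites listed are palindromic, scanning one strand is sufficient; non-palindromic sites would require scanning the reverse complement as well.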

Random Amplified Polymorphic DNA (RAPD) fingerprint- 
ing is a low-cost yet powerful molecular typing method for organ- 
isms from bacterial species to mammals [17, 18]. RAPDs have 
become well-established genetic tools for genomic mapping and 
linkage analysis, genotype fingerprinting and identification, and 
quantification of genetic relationships, similarities, and variation in 
a variety of organisms. RAPDs are generated from genomic DNA 
by PCR amplification of only a small amount of total DNA using 
short (usually 10-mer) randomly constructed oligonucleotides. 
The advantages of this technique are the essentially unlimited 
number of loci that can be examined, no need for prior DNA 
sequence knowledge, and low amount of template DNA required. 
However, reported limitations of RAPD markers include a low 
reproducibility of banding patterns since the quality and concen- 
tration of template DNA, concentrations of PCR components, 
presence of inhibitors, or PCR cycling conditions may influence 
the results obtained [19-21]. Thus, RAPDs require a careful 
experimental set-up and careful laboratory habits. Furthermore, 
nearly all RAPD markers behave as dominant markers; therefore, 
it is not possible to distinguish whether a specific DNA fragment 
amplified from a locus is heterozygous (present in one copy) or 
homozygous (present in two copies) in the organism analysed. 

Amplified Fragment Length Polymorphism (AFLP) analysis is 
based on the restriction digestion of genomic DNA, followed by 
ligation of adapters specific to the restriction sites, and subsequent 
PCR amplification with adapter-specific primers [22]. This 
multilocus DNA profiling technique has successfully been used for 
identifying polymorphisms in both prokaryotic and eukaryotic 
organisms and allows the reliable identification of a high number 
of loci in a single assay [23-25]. For AFLP analysis, only tiny 
amounts of purified genomic DNA are needed; it does not require 
prior knowledge of DNA sequences, detects variation over the 
entire genome, and has proved to be robust and reliable because it 
uses stringent reaction conditions. As a consequence, it has 
become one of the most frequently used methods over the last 
15 years for linkage mapping, analysis of genetic diversity, 
population genetics, and single-locus PCR marker development in a wide 
variety of organisms such as bacteria, fungi, insects, plants, and 
animals. Genetic variation detected using the AFLP method is 
represented by the presence or absence of amplified DNA 
fragments (bands), which are classified as dominant markers. The 
presence of a band implies that the locus is either homozygotic 
(AA: both haploids produce homologous fragments) or hetero- 
zygotic (Aa: one haploid produces a fragment), resulting in the 
inability to distinguish heterozygotes from homozygotes in AFLP 
analysis. However, the high number of polymorphic loci detected 
in AFLP analysis can compensate for this shortcoming because the 
number of loci is a crucial factor for estimating reliable population 
genetic parameters. 

Regions within DNA sequences where short sequences of 
nucleotides are repeated one after the other (in tandem arrays) 
are referred to as “microsatellites” [26]. The repeat is usually a 
short sequence consisting of two, three, or four nucleotides 
(referred to as di-, tri-, or tetranucleotide repeats, respectively). 
Therefore, they are referred to as Simple Sequence Repeats (SSRs) 
or Short Tandem Repeats (STR). The number of repeats at a 
particular location within a genome can vary between individuals 
within the same species and therefore they are also tagged as 
Variable Number Tandem Repeats (VNTR). For example, in 
some individuals the repeated unit may occur 6 times; in others it 
may be 12, or 3, or 50. Quite frequently, the number of repeats and 
the type of repeat are designated in a formula such as (GT)12, 
where GT refers to the specific nucleotides that are repeated and 
the subscript (12) designates the number of times the sequence is 
repeated. In diploid organisms, each individual will have two cop- 
ies of a particular microsatellite segment, so each locus has its own 
set of microsatellite alleles. Accordingly, alleles at a specific locus 
can differ in the number of repeats, e.g. an individual may have 
eight microsatellite repeats in one allele and — in the case of het- 
erozygous individuals — ten repeats in the other allele. Polymor- 
phic microsatellite loci are ideal molecular markers, which have a 
wide range of applications including the determination of pater- 
nity, population genetic studies, and recombination mapping. 
Since microsatellite repeats tend to occur in noncoding regions 
of the DNA, they are selectively neutral. They are also usually 
inherited in a co-dominant Mendelian fashion. 
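
The (GT)12 style of notation lends itself to a small Python sketch that scans a sequence for di-, tri-, and tetranucleotide repeats. The function name `find_microsatellites`, the minimum of five repeats, and the demonstration sequence are assumptions made for illustration; dedicated repeat-finding software applies more refined criteria.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 5):
    """Return (start position, formula) for perfect tandem repeats.

    Units of 2-4 nucleotides are considered; the shortest unit that
    explains a repeat is reported, and homopolymer runs are skipped.
    """
    seq = seq.upper()
    # Group 1 is the repeat unit; the backreference demands further copies.
    pattern = re.compile(r"([ACGT]{2,4}?)\1{%d,}" % (min_repeats - 1))
    found = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:  # e.g. AAAAAAAA is not a microsatellite here
            continue
        n = len(match.group(0)) // len(unit)
        found.append((match.start(), f"({unit}){n}"))
    return found


if __name__ == "__main__":
    demo = "ACGTTAGC" + "GT" * 12 + "TTCAAT" + "CAG" * 6 + "TA"
    print(find_microsatellites(demo))
```

Comparing the repeat number reported for the same locus in different individuals then corresponds to comparing alleles at that microsatellite locus.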

Single Strand Conformation Polymorphism (SSCP) analysis 
takes advantage of the fact that slight differences in the nucleotide 
sequence of two single-stranded DNA fragments can result in a 
different secondary structure with dissimilar migration speed 
during electrophoretic separation in a gel matrix [27-29]. Therefore, 
it is possible to detect polymorphisms of even a single base pair 
between two DNA samples. SSCP analysis offers an inexpensive, 
convenient, and sensitive method for determining genetic 
variation at single or multiple locations in DNA fragments and is 
frequently used both for generating genetic markers in population 
genetic studies as well as a mutation scanning technique for 
diagnostic purposes. It can also be used for distinguishing 
homozygous and heterozygous states of two alleles, since each alternative 
should result in a different conformation, and thus, in distinct 
patterns after electrophoresis. However, problems with 
reproducibility of the SSCP banding pattern have been reported, since 
changes in concentrations of MgCl2, DNA template, and PCR 
primers in the PCR reaction, as well as the choice and 
concentrations of chemical denaturants used before electrophoretic 
separation, may significantly influence SSCP pattern formation [30]. 

Fig. 4.1 Schematic outline of ARDRA. DNA is amplified with primers specific for the rRNA gene cluster (the 16S or 18S); amplicons 
are digested with restriction enzymes to reflect the slight differences in the sequences of the different DNA samples. 

For profiling of microbial communities, Amplified Ribosomal 
DNA Restriction Analysis (ARDRA) is a very simple method 
frequently used, which is based on the analysis of restriction 
enzyme digestion of the amplified 16S rRNA genes of bacterial 
species [31, 32]. For eukaryotic microbes, alternatively the 18S 
rRNA genes can be targeted (Fig. 4.1). When universal 16S rRNA 
gene primers are used, the analysis gives little or no information 
on the type of microorganisms in the sample but can be used to 
quickly assess genotypic changes in a community over time or to 
compare communities subjected to different environmental con- 
ditions. If primers specific for a particular genus are designed and 
used, this technique can be employed for molecular phylogenetic 
studies as well. Species-specific primers can be used to find markers 
for strain identification (Table 4.1). 

Terminal restriction fragment length polymorphism (T-RFLP) 
allows the fingerprinting of a microbial community by analysing the 
polymorphism of a gene. The method was first described by [33] 
with the amplification of the 16S rDNA gene. But any suitable gene 
can be chosen (Table 4.1). It is a high-throughput, reproducible 
method that allows semiquantitative analysis of the diversity of a 
particular gene in a community [14]. 
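
The idea of a terminal fragment can be sketched in Python as follows. This is a simplified in silico illustration, not the laboratory procedure: the function names are hypothetical, only the 5' labelled end is considered, the default enzyme is MspI (C^CGG), and the toy sequences are invented.

```python
# Illustrative sketch: size of the labelled terminal restriction fragment
# for each amplicon, and a profile counting how often each size occurs.
def terminal_fragment_length(amplicon: str, site: str, cut_offset: int) -> int:
    """Length from the labelled 5' end to the first cut.

    cut_offset is the number of bases of the site that remain on the
    5' fragment. An amplicon without the site is detected at full length.
    """
    position = amplicon.upper().find(site)
    return len(amplicon) if position == -1 else position + cut_offset


def trflp_profile(amplicons, site: str = "CCGG", cut_offset: int = 1) -> dict:
    """Map terminal fragment size to the number of amplicons giving it."""
    profile = {}
    for seq in amplicons:
        size = terminal_fragment_length(seq, site, cut_offset)
        profile[size] = profile.get(size, 0) + 1
    return dict(sorted(profile.items()))


if __name__ == "__main__":
    community = ["AAAACCGGTTTT", "AAAACCGGTTTT", "AACCGGTTTTCCGG", "AAAATTTT"]
    print(trflp_profile(community))
```

In this sketch the counts stand in for peak heights, which is why the method is described as semiquantitative; sequences that share the position of the first site cannot be told apart.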

Denaturing Gradient Gel Electrophoresis (DGGE) and Temperature Gradient 
Gel Electrophoresis (TGGE) are related techniques in which allelic 
differences in DNA sequences of interest are detected after PCR 
amplification based on their difference in migration on a gel 
with a denaturing gradient [34, 35]. The denaturing gradient is 
created either through difference in the chemical constitution of 
the gel with an increasing gradient of denaturant (urea and 
formamide) (DGGE) or through differences in temperature along 
the length of the gel (TGGE). These techniques are related in 
principle to the SSCP technique. The difference lies in that, in 
SSCP, migration differences in allelic forms of a DNA sequence 
are detected through difference in the conformational modes 
taken up by single-stranded DNA, while in D/TGGE, the detec- 
tion is based on mobility of double-stranded DNA of allelic forms 
of a DNA sequence. The denaturation point is species specific. 

RFLP is a standard hybridisation-based DNA fingerprinting 
technique; in fact it is the first method to have been devised to 
analyse genetic diversity in a population of a species [36]. It can also 
serve as an identification mark (barcode) for the individual. 
It involves restriction digestion of DNA, Southern blotting, and 
hybridisation with a probe sequence. The restriction enzyme and 
probe combinations are decided based on the aim of the 
experiment. A single locus probe is used to analyse polymorphism in a 
single gene (a typical RFLP) or a multilocus probe is used to analyse 
the overall DNA polymorphism (a DNA fingerprint) in the sample. 

This chapter presents some of the most frequently used 
methods for genetic fingerprinting and profiling of microbial 
communities including a selection of suitable protocols for 
isolation and application of the given molecular marker. For all the 
fingerprinting techniques presented, we end up with a gel or 
autoradiogram showing bands or an electropherogram showing 
peaks. The gel/autoradiogram picture is captured in a digital form 
and the banding pattern is analysed using suitable software. 
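
As a rough illustration of what such software does with a banding pattern, the sketch below scores bands as present (1) or absent (0) and compares two lanes. The Jaccard coefficient is used here as one common similarity measure; it is an assumption of this sketch, not a method prescribed by the chapter, and the band sizes and function names are invented.

```python
# Illustrative sketch: binary scoring of bands and pairwise similarity.
def band_vector(observed_sizes, size_bins):
    """Return 1/0 for each reference band size depending on its presence."""
    observed = set(observed_sizes)
    return [1 if size in observed else 0 for size in size_bins]


def jaccard(a, b) -> float:
    """Shared bands divided by bands present in at least one lane."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 0.0  # no bands at all: report 0.0


if __name__ == "__main__":
    bins = [300, 500, 800, 1200]                    # assumed band sizes in bp
    lane_a = band_vector([300, 500, 1200], bins)
    lane_b = band_vector([500, 800, 1200], bins)
    print(lane_a, lane_b, jaccard(lane_a, lane_b))
```

In practice, bands are first assigned to size classes with some tolerance before scoring, a step omitted here for brevity.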



4.2 Requirements 



4.2.1. Common Requirements for All Protocols 

1. Thermocycler 

2. 0.2 ml PCR tubes or 96-well microplates 

3. DNA in water or TE buffer 

4. dNTP mix 

5. 10x PCR buffer (usually provided with the respective polymerase) 

6. DNA polymerase 

7. Primers [different for each technique and dependent on the 
aim of the experiment in some cases (Table 4.1)] 

8. PCR product or gel purification kit (for some applications) 

9. Agarose/polyacrylamide or premade polyacrylamide solutions 

10. 1x TBE buffer: 0.1 M Tris-HCl, 0.1 M boric acid, 2 mM 
EDTA, pH 8.0 

11. 1x TAE buffer: 40 mM Tris, 40 mM acetic acid, 1 mM 
EDTA, pH adjusted to 7.6 with glacial CH3COOH or Tris 

12. 1x TE buffer: 10 mM Tris-HCl, 1 mM EDTA, pH 8.0 

13. 6x loading dye: 15 % (w/v) Ficoll 400, 0.25 % bromophenol 
blue, 1x TAE 

14. DNA size marker (in the size range of the DNA fragments 
expected) 

15. DNA staining dye like ethidium bromide/SYBR Green 



4.2.2. Random Amplified Polymorphic DNA 

1. RAPD primer 



4.2.3. Amplified Fragment Length Polymorphism 

1. Restriction enzymes, such as EcoRI (6-bp cutter) and MseI 
(4-bp cutter) with appropriate reaction buffers (the primer 
and adapter sequences given below will vary depending on the 
enzymes chosen). EcoRI and MseI are frequently used and are 
given as examples here. 

2. T4 DNA ligase with appropriate reaction buffer. 

3. Adapters and primers as follows: 

(a) EcoRI adapter forward: 5'-CTCGTAGACTGCGTACC-3' 

(b) EcoRI adapter reverse: 5'-AATTGGTACGCAGTCTAC-3' 

(c) MseI adapter forward: 5'-GACGATGAGTCCTGAG-3' 

(d) MseI adapter reverse: 5'-TACTCAGGACTCAT-3' 

(e) Eco-0: 5'-GACTGCGTACCAATTC-3' 

(f) Mse-0: 5'-GATGAGTCCTGAGTAA-3' 

(g) Eco-NN: *5'-GACTGCGTACCAATTCNN-3' 

(h) Mse-NNN: 5'-GATGAGTCCTGAGTAANNN-3' 

N in the primer sequence designates A, T, C, or G and 
can be modified in numbers from one to three selective 
nucleotides at each 3'-end of the primer. The asterisk (*) 
refers to the fact that usually the Eco-selective AFLP 
primer is fluorescently labelled in case an automated 
sequencer is used for electrophoretic separation. 

4. 2x Stop solution: 95 % formamide, 10 mM NaOH, 0.25 % 
bromophenol blue, filter sterilised through a 0.45 µm filter. 

5. Reagents for electrophoretic separation as required for the 
separation/detection system used. Commercially available 
precast gels can also be used. 



4.2.4. SSR and STR (Simple Sequence Repeats and Short Tandem Repeats) 

1. Restriction enzyme RsaI and its suitable buffer 

2. Linker 1: 5'-GTTTAGCCTTGTAGCAGAAGC-3' 

3. Linker 2: 5'-pGCTTCTGCTACAAGGCTAAACAAAA-3' 
(p indicates phosphorylation) 

4. 10 % SDS 

5. 20x SSC buffer: 3 M NaCl, 300 mM trisodium citrate, pH 
7.0 adjusted with HCl 

6. Biotinylated repeat oligonucleotides such as b(GA)n or 
b(CA)n 

7. Streptavidin-coated magnetic beads (e.g. Dynabeads from 
Dynal), washed according to the manufacturer’s instructions 

8. TA cloning vector kit 

9. Competent Escherichia coli cells 

10. One forward and one reverse primer for amplification of the 
specific target sequence 



4.2.5. SSCP (Single Strand Conformation Polymorphism) 

1. One forward and one reverse primer for amplification of the 
specific target sequence (see Table 4.1) 

2. 1x formamide dye: 98 % formamide, 10 mM EDTA, 0.025 % 
bromophenol blue, 0.025 % xylene cyanol FF, pH 8.0 



4.2.6. ARDRA (Amplified Ribosomal DNA Restriction Analysis) 

1. One forward and one reverse primer for amplification of the 
specific target sequence 

2. Commercial kit for purification of PCR fragments 

3. Specific restriction enzyme and its suitable buffer 



4.2.7. T-RFLP (Terminal Restriction Fragment Length Polymorphism) 

1. Fluorescently labelled primers (see Note 5). If both primers 
used are labelled, a different dye is used for each. The 
amplification efficiency of labelled primers tends to be lower than 
that of unlabelled primers, frequently leading to lower yields. 
It is necessary to pool several PCR reactions to obtain enough 
products for further steps (200-300 ng of DNA 
recommended per restriction digest). Therefore four replicate 
50 µl PCR reactions are made for each sample and the 
amplicons are pooled. 

2. Commercial kit for purification of PCR fragments. 



4.2.8. DGGE and TGGE (Denaturing/Temperature Gradient Gel Electrophoresis) 

1. A forward primer with a GC clamp and a reverse primer for 
amplification of the specific target sequence. If nested PCR is 
used, the forward primer of the second PCR cycle is designed 
with the GC clamp. A GC clamp typically has a sequence like 
5' CGC CCG CCG CGC GCG GCG GGC GGG GCG GGG 
GCA CGG GGG G 3'. 



4.2.9. RFLP (Restriction Fragment Length Polymorphism) 

1. Genomic DNA from individual samples 

2. Restriction enzymes 

3. DNA size marker 

4. Probe sequence 

5. Probe labelling kit 

6. Blotting membrane 

7. 20x SSC buffer: 3 M NaCl, 300 mM trisodium citrate, pH 
7.0 adjusted with HCl 

8. Hybridisation solution: 1 % SDS, 1 M NaCl, 10 % dextran 
sulphate, made to 1 l with 50 mM Tris-Cl pH 7.5. The 
solution can be stored in a freezer and resuspended in a 
water bath at 65 °C before use 

9. 0.1 % SDS 

10. Salmon sperm DNA (10 µg/µl) 

11. X-ray film 

12. X-ray developing and fixing solutions 



4.3 Methods 



4.3.1. RAPD (Random Amplified Polymorphic DNA) 



For generating RAPD markers, an extract of total genomic DNA 
is amplified with a single short (10-mer) primer of arbitrary 
nucleotide sequence. These short primers anneal at random sites at a 
number of locations in the genome. For successful amplification 
to occur, the primer must bind to matching sequences that occur 
as inverted repeats within a distance between 50 and 3,000 bp. 
The more often sequences matching the RAPD primer occur as 
inverted repeats within a distance of 3,000 bp (the threshold 
length for PCR amplification) in the genome of the respective 
organism, the more amplification products are generated during 
PCR (Fig. 4.2a). An example picture of a RAPD fingerprint is 
given in Fig. 4.2b. 

Fig. 4.2 (a) Schematic outline of RAPD analysis. Genomic DNA is amplified using a single short (decamer) primer of 
arbitrary (random) sequence at a low annealing temperature and amplified products are separated on an agarose gel. 
(b) Example of banding pattern obtained after RAPD analysis of isolates of the entomopathogenic fungi Beauveria 
bassiana and Nomuraea rileyi. M designates a molecular weight marker (picture courtesy of authors). 

When starting a RAPD project, several factors usually have
to be carefully adjusted. For instance, the concentrations of reaction
components such as MgCl2, primer, and dNTPs, as well as the
quality and concentration of the target DNA, have to be tested
experimentally for amplification efficiency. Once these parameters
are adapted for the organism under study, PCR reactions can be
set up. In RAPD, only one primer is used (unlike the traditional
two for standard PCR reactions). RAPD PCR reactions are performed
at a very low annealing temperature and with a high number of
PCR cycles (usually 45). The low annealing temperature
(usually 36 °C) permits promiscuous pairing of the primer,
tolerating single mismatches. This is a trade-off
to achieve a reasonable number of amplicons for assessing
polymorphism. At the same time, this feature of RAPD analysis is the
chief reason for reported problems in the reproducibility of the
reactions. Accordingly, the number of amplification products in a
RAPD reaction has been shown to be affected by the accuracy of the
PCR machine and the fidelity of the Taq DNA polymerase used.

RAPD amplification products are separated on a 1-2 % aga- 
rose gel, stained with ethidium bromide, silver nitrate, or other 
suitable dyes, and photographed. Variability between the indivi- 
duals is scored as the presence (1) or absence (0) of a specific band 
(amplification product) and various coefficients of genetic diver- 
sity can be calculated subsequently. 
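
The presence/absence scoring described above can be sketched in a few lines of Python. This is only an illustration: the isolate names and band profiles are hypothetical, and the Jaccard coefficient is used as one common example of a similarity measure, not as the coefficient prescribed by this chapter.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two 0/1 band profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same band positions")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0

# Hypothetical scoring table: one row per isolate, one column per band position
# (1 = band present, 0 = band absent).
profiles = {
    "isolate_A": [1, 1, 0, 1, 0],
    "isolate_B": [1, 0, 0, 1, 1],
    "isolate_C": [0, 1, 1, 0, 0],
}

for (n1, p1), (n2, p2) in combinations(profiles.items(), 2):
    print(f"{n1} vs {n2}: {jaccard(p1, p2):.2f}")
```

Bands absent from both isolates are ignored by this coefficient, which is often preferred for dominant markers because a shared absence is weak evidence of relatedness.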

1. Add the following to the tubes on ice (total volume of 25 µl)
(see Notes 1, 2, 3):

(a) 17.8 µl sterile distilled H2O

(b) 2.5 µl 10x PCR buffer (contains 15 mM MgCl2)

(c) 1.0 µl RAPD primer (5 pmol/µl)

(d) 2.5 µl 2 mM dNTPs

(e) 0.2 µl DNA polymerase (5 U/µl)

(f) 1.0 µl template DNA (diluted to ca. 10 ng/µl) (see Note 4)

2. Mix gently, centrifuge briefly, and place in thermocycler. 

3. PCR program 

(a) Initial denaturation at 94 °C for 2 min 

(b) Amplification at 45 cycles of 

• 94 °C for 1 min 

• 36 °C for 1 min 

• 72 °C for 2 min 

(c) Final elongation at 72 °C for 4 min 

(d) Hold reactions at 4 °C 


4. Electrophoretic separation

• Add 3.0 µl 6x loading dye to the tubes on ice, mix, and
centrifuge briefly.

• Load 15 µl of the reaction on a 1-2 % agarose gel in
1x TBE or 1x TAE buffer and separate fragments at
80 V for ca. 1 h depending on the expected size of the bands.



4.3.2. AFLP (Amplified Fragment Length Polymorphism)

For AFLP fingerprinting, genomic DNA is generally digested with
two restriction enzymes, one with an average cutting frequency
(e.g. EcoRI or other 6 bp-cutters) and the other with a higher
cutting frequency (e.g. MseI or other 4 bp-cutters). The restriction
enzymes are chosen based on the genome size and complexity of
the species being studied. After restriction digestion,
double-stranded oligonucleotide adapters corresponding to the
restriction sites are ligated to the fragments. These adapters serve
as binding sites for adapter-specific primers in two subsequent PCR
amplifications with an aliquot of the ligation reaction as template.
In the first round of amplification, termed pre-amplification,
only those restriction fragments carrying adapters specific for the
two restriction enzymes at their 5'- and 3'-ends are amplified. In
the second round of amplification, basically the same primers are
used, differing from the pre-amplification primers by an
extension of one to three selective nucleotides at their 3'-end. This
allows the amplification of only a subset of fragments from the
initial pool of pre-amplified PCR fragments under highly stringent
conditions. If both primers have, e.g., two selective nucleotides at
their 3'-ends, 16 primers per restriction site are possible, giving
256 different primer combinations that together amplify the whole
population of possible fragments. The strategy behind this approach
is to reduce the number of amplicons, and therefore bands, to a
reasonable number to make downstream analysis more practical
(Fig. 4.3a).
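
The primer arithmetic above (16 primers per site, 256 combinations) can be checked with a short Python sketch. The function name is illustrative only.

```python
from itertools import product

BASES = "ACGT"

def selective_extensions(n):
    """All possible 3' selective extensions of length n."""
    return ["".join(p) for p in product(BASES, repeat=n)]

eco_ext = selective_extensions(2)          # extensions for the EcoRI-side primer
mse_ext = selective_extensions(2)          # extensions for the MseI-side primer
pairs = list(product(eco_ext, mse_ext))    # every primer combination

print(len(eco_ext), len(pairs))            # 16 256
```

With n selective nucleotides on each primer, the number of combinations grows as 4^n x 4^n, so each additional selective base on one primer reduces the expected number of bands per reaction roughly fourfold (assuming, as a simplification, equal base frequencies).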

After amplification, AFLP fragments are loaded on a denaturing
polyacrylamide gel or another gel matrix with a high-resolution
capacity. Initially, AFLP fragments were radioactively labelled,
either by radioactive end labelling of one of the AFLP
primers or, in the course of PCR amplification, by incorporation
of radioactive nucleotides. These days, one of the AFLP
primers used during selective amplification is commonly labelled
with a fluorophore, and the fluorescently labelled fragments are
detected during electrophoretic separation on an automated
sequencer. Using several different AFLP primers, each
labelled with a different fluorophore, during selective amplification
allows the set-up of a multiplex PCR reaction in which AFLP
products are detected separately for each fluorophore. An example of
an AFLP fingerprint based on fluorescently labelled fragments is
shown in Fig. 4.3b. Alternatively, AFLP fragments are not labelled
during PCR, but gels are stained after electrophoresis with silver
nitrate or another sensitive DNA stain such as SYBR Safe
or ethidium bromide.
[Fig. 4.3a, schematic: genomic DNA; 1. digestion with EcoRI and MseI; 2. adapter ligation; 3. pre-amplification with the Eco and Mse preselective primers; 4. amplification with a fluorescently labelled Eco + AT selective primer, giving a labelled AFLP fragment]

Fig. 4.3 (a) Schematic outline of AFLP analysis. Genomic DNA is digested with two restriction enzymes; adapters are
ligated to the restriction fragments and serve as primer binding sites for subsequent PCR reactions. During selective
amplification, primers matching the adapter sequence with one, two, or three additional nucleotides are used. (These
primers are labelled with a fluorescent tag if the electrophoresis system used has a fluorescence detection system, as
most capillary DNA sequencers do.) The amplified products are separated on polyacrylamide gels or via capillary
electrophoresis. (b) Example of part of a banding pattern obtained after AFLP analysis of 11 isolates of the
entomopathogenic fungus Nomuraea rileyi. M designates a molecular weight marker.



1. Pipette together:

(a) <9 µl DNA (200 ng in <9 µl) (see Note 4)

(b) 1.25 µl 10x appropriate restriction enzyme buffer
(with 100 µg/ml BSA)

(c) 0.2 µl EcoRI (20 U/µl = 5 U, NEB)

(d) 0.3 µl MseI (10 U/µl = 3 U, NEB)

(e) Fill up to a total volume of 12.5 µl with sterile distilled
H2O

2. Mix gently, centrifuge briefly, and incubate at 37 °C for 2 h. 

3. Inactivate restriction enzymes by incubating 15 min at 65 °C, 
and place on ice (or hold at 4 °C). 

4. Preparation of the adapter solutions

(a) For EcoRI and MseI separately, combine equal volumes of the
forward and reverse adapter stock solutions (100 pmol/µl each)
in one tube and mix.

(b) Heat at 65 °C for 10 min, and allow to cool at room
temperature for 1-2 h.

5. Add the following to the tubes (on ice):

(a) 12.5 µl digested DNA (complete digest)

(b) 2.5 µl 10x T4 DNA Ligase buffer

(c) 1.0 µl EcoRI-adapter solution (5 pmol/µl)

(d) 1.0 µl MseI-adapter solution (50 pmol/µl)

(e) 0.5 µl T4 DNA Ligase (400 U/µl = 200 U)

(f) 7.5 µl sterile distilled H2O

6. Mix gently, centrifuge briefly, incubate at 16 °C for 2 h,
and inactivate the enzyme by incubating for 15 min at 65 °C.

7. Perform a 1:10 dilution of the ligation mixture by transferring
10 µl of the ligation reaction to a new tube/plate and adding
90 µl TE buffer, and freeze the samples. Store the unused portion
(15 µl) of the ligation reaction at -20 °C for long-term
use.

8. Add the following to the tubes on ice (total volume of 10 µl)
(see Notes 1, 2, 3):

(a) 3.8 µl sterile distilled H2O

(b) 1.0 µl 10x PCR buffer (contains 15 mM MgCl2)

(c) 1.0 µl primer Eco-0 (5 pmol/µl)

(d) 1.0 µl primer Mse-0 (5 pmol/µl)

(e) 1.0 µl 2 mM dNTPs

(f) 0.2 µl DNA polymerase (5 U/µl)

(g) 2.0 µl diluted ligation reaction mixture

9. Mix gently, centrifuge briefly, and place in thermocycler.

10. PCR program:

(a) Amplification at 20 cycles of

• 94 °C for 30 s

• 56 °C for 1 min

• 72 °C for 1 min

(b) Hold reactions at 4 °C

11. Dilute an aliquot of the reaction 1:50 with sterile distilled
H2O.

12. Store reactions and diluted aliquots at -20 °C.

13. Add the following to the tubes on ice (total volume of 11 µl)
(see Notes 1, 2, 3):

(a) 5.1 µl sterile distilled H2O

(b) 1.1 µl 10x PCR buffer (contains 15 mM MgCl2)

(c) 1.0 µl MseI primer + N (5 pmol/µl)

(d) 0.5 µl fluorescently labelled EcoRI primer + NN
(1 pmol/µl) (see Note 5)

(e) 1.1 µl 2 mM dNTPs

(f) 0.2 µl DNA polymerase (5 U/µl)

(g) 2.0 µl diluted pre-amplified DNA

14. Mix gently, centrifuge briefly, and place in thermocycler.

15. PCR program:

(a) Amplification at 12 cycles of

• 94 °C for 30 s

• 65 °C for 30 s, with a decrease in annealing temperature
of 0.7 °C/cycle during the following cycles

• 72 °C for 1 min

(b) Amplification at 23 cycles of

• 94 °C for 30 s

• 56 °C for 1 min

• 72 °C for 1 min

(c) Hold reactions at 4 °C

16. Store reactions at 4 °C or -20 °C before electrophoretic
separation.

17. Add 5.0 µl 2x Stop Solution to the tubes on ice, mix, and
centrifuge briefly.

18. Denature samples for 3 min at 94 °C and then quickly cool on ice.

19. Load 1.0 µl (or more, depending on gel type) on a polyacrylamide
gel and separate fragments according to the specifications
of the electrophoretic apparatus to be used.

20. Analyse presence/absence of bands according to the technique
and software used.
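
The touchdown program of the selective amplification (step 15) can be written out cycle by cycle with a small Python sketch. It assumes that the first cycle anneals at 65 °C and that each of the following touchdown cycles is 0.7 °C cooler; the function name and this reading of the program are assumptions made for illustration.

```python
def touchdown_schedule(start_c, step_c, touchdown_cycles, final_c, final_cycles):
    """Annealing temperature (°C) for each cycle of a touchdown PCR."""
    ramp = [round(start_c - step_c * i, 1) for i in range(touchdown_cycles)]
    return ramp + [final_c] * final_cycles

# 12 touchdown cycles starting at 65 °C, then 23 cycles at 56 °C.
schedule = touchdown_schedule(65.0, 0.7, 12, 56.0, 23)

print(len(schedule))               # total number of cycles
print(schedule[:3])                # first touchdown cycles
print(schedule[11], schedule[12])  # last touchdown cycle, first constant cycle
```

Under this reading, the last touchdown cycle anneals at 57.3 °C, just above the 56 °C used for the remaining cycles.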




4 Genetic Fingerprinting Techniques for Molecular Characterisation of Microbes 






4.3.3. SSR and STR (Short Tandem Repeats and Simple Sequence Repeats)



Commonly, microsatellite loci are detected in individuals by PCR,
using primers that bind to unique sequences in the regions flanking
the SSR. A single pair of PCR primers is therefore useful for
each SSR in a given species and produces products of different size
depending on the number of microsatellite repeat units in each
allele. With this technique it is possible to differentiate between
the homozygous and heterozygous condition of an SSR locus. The
microsatellite primers are usually developed by inserting random
segments of the genomic DNA of interest into a vector and
cloning them in bacteria (Fig. 4.4a). Bacterial colonies containing
these segments are screened with labelled oligonucleotides that
hybridise to a given microsatellite repeat. Positive clones are
sequenced and PCR primers are designed based on the SSR
flanking sequences. The primers are subsequently used in standard
PCR reactions (Fig. 4.4b) and the resulting PCR products are
separated either by electrophoresis on gels with high-resolution
capacity or by capillary electrophoresis. The size of the PCR
product(s) allows the determination of how many times the
given short sequence of nucleotides is repeated in each allele.
In addition to the expected one or two major bands, microsatellite
PCR often produces minor bands. These so-called stutter bands differ
from the major bands by a few nucleotides and result from
misamplification of the locus during the PCR process. An example
of an SSR fingerprint is shown in Fig. 4.4c.
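
How a product size translates into a repeat number can be illustrated with a short Python sketch. The locus (flanking length, motif, allele sizes) is hypothetical and only serves to show the arithmetic.

```python
def repeat_count(amplicon_bp, flank_bp, motif_bp):
    """Estimate the number of repeat units from a PCR product size."""
    repeat_region = amplicon_bp - flank_bp
    if repeat_region < 0:
        raise ValueError("amplicon shorter than flanking sequence")
    return repeat_region / motif_bp

# Hypothetical locus: 120 bp of flanking sequence (primers included)
# around a dinucleotide (CA)n repeat.
alleles = [150, 158]   # two product sizes, i.e. a heterozygous individual
for size in alleles:
    print(size, repeat_count(size, flank_bp=120, motif_bp=2))
```

A non-integer result would suggest an insertion or deletion in the flanking region or an imprecise size call rather than a change in repeat number.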

Conventionally, microsatellite loci are identified in a species of
interest as described above, using genomic libraries selected for small
insert size and screening several thousand clones through colony
hybridisation with repeat-containing probes. Although this is a
relatively simple approach, the identification process can become
very tedious, especially when working with microsatellite-rich
genomes, and is ineffective for species with low microsatellite
frequencies. Thus, several enhanced protocols have been published
which aim at increasing the proportion of genomic DNA fragments
containing repeat motifs in microsatellite-enriched libraries [37]. In
most cases, such an enrichment is accomplished by hybridising
linker-ligated genomic DNA to synthesised oligonucleotide
repeats, which are labelled, e.g., with biotin. As biotin binds strongly
to streptavidin, the biotinylated oligos, along with the bound
strands of genomic DNA, bind to streptavidin-coated iron beads
that are subsequently added. After removing
any genomic DNA not bound to these oligos, a repeat-enriched
genomic DNA solution is obtained and is used for further cloning
and analysis. The following protocol includes an enrichment
procedure and is based on the methods published in [38, 39].

Fig. 4.4 (a) Steps during isolation of SSR markers and generation of an SSR-enriched library. (b) Schematic outline of SSR
analysis. Genomic DNA is amplified with primers flanking the previously identified SSRs and the fragments obtained are
separated via electrophoresis. (c) Example of a banding pattern obtained after SSR analysis of six individuals of a
lepidopteran insect, Lobesia botrana. M designates a molecular weight marker (picture courtesy of authors).
1. Pipette together:

(a) <44 µl genomic DNA (4-10 µg DNA in <44 µl)

(b) 5.0 µl RsaI 10x restriction buffer

(c) 2.0 µl RsaI (20 U/µl)

(d) Fill up to a total volume of 50 µl with sterile distilled
H2O

2. Mix gently, centrifuge briefly, and incubate at 37 °C for 2 h or 
longer. 

3. Inactivate restriction enzymes by incubating 15 min at 65 °C, 
and place on ice (or hold at 4 °C). 

4. Separate fragments on a 1 % agarose gel, and gel purify frag- 
ments between ca. 300 and 1,000 bp using a commercial gel 
purification kit according to the manufacturer’s instructions. 

5. Preparation of double-stranded linker solutions

(a) Combine equal volumes of the 10 µM forward and reverse
linkers (linker 1 and linker 2) in one tube and mix;
this makes a 5 µM double-stranded linker stock
solution.

(b) Heat at 65 °C for 10 min, and allow to cool at room
temperature for 1-2 h.

6. Add the following to the tubes on ice (total volume of 30 µl):

(a) 10.0 µl digested, size-selected, and gel-purified DNA
(2-5 µg)

(b) 3.0 µl 10x T4 DNA Ligase buffer

(c) 10.0 µl double-stranded linker solution (5 µM)

(d) 2.0 µl T4 DNA Ligase (400 U/µl)

(e) 1.0 µl restriction enzyme XmnI (20 U/µl)*

(f) 4.0 µl sterile distilled H2O

*Note: During the ligation/restriction reaction, linkers that
ligate to each other form an XmnI restriction site. Including
XmnI in the ligation reaction cuts these dimers
apart, which maintains a large pool of monomer
linkers for ligation to genomic DNA fragments.

7. Mix gently, centrifuge briefly, and incubate at 16 °C for 2 h;
inactivate enzymes by incubating for 15 min at 65 °C.

8. To ensure that the linkers have ligated to the genomic DNA,
perform a PCR with the ligated DNA:

9. Add the following to the tubes on ice (total volume of 20 µl):

(a) 12.8 µl sterile distilled H2O

(b) 2.0 µl 10x PCR buffer (contains 15 mM MgCl2)

(c) 1.0 µl linker 1 (0.8 µM final concentration)

(d) 2.0 µl 2 mM dNTPs

(e) 0.2 µl DNA polymerase (5 U/µl)

(f) 2.0 µl template DNA (ligated genomic DNA)

10. Mix gently, centrifuge briefly, and place in thermocycler.

11. PCR program:

(a) Initial denaturation at 94 °C for 2 min

(b) Amplification at 30 cycles of

• 94 °C for 30 s

• 60 °C for 1 min

• 68 °C for 1 min

(c) Final elongation at 68 °C for 5 min

12. Run products on a 1 % agarose gel; a smear is expected within
the size range of 200-1,000 bp if linkers have successfully
ligated.

13. Pipette together for a total volume of 200 µl:

(a) 2.0 µl linker-ligated genomic DNA (50 ng/µl)

(b) 100.0 µl 2x hybridisation solution (12x SSC, 0.1 % SDS)

(c) 2.0 µl biotinylated repeat oligonucleotide (20 nM)

(d) Fill up to a total volume of 200 µl with sterile distilled
H2O

14. Mix gently, centrifuge briefly, denature for 15 min at 95 °C,
and incubate at 60 °C (or other hybridisation temperature,
depending on the Tm of the repeat oligo) overnight.

15. Mix with 600 µg of streptavidin-coated magnetic beads.

16. Incubate at 43 °C for 2 h with continuous agitation.

17. Wash beads twice in 2x SSC, 0.1 % SDS at room temperature
for 5 min*.

18. Wash beads twice in 1x SSC, 0.1 % SDS at 60 °C for 5 min*.

*Note: Wash temperatures can be adjusted to increase
(hotter) or decrease (cooler) hybridisation stringency.

19. Elute repeat-enriched genomic DNA from the beads with
60 µl preheated TE buffer by incubating at 95 °C for
10 min and immediately recovering the eluate.

20. Add the following to the tubes on ice (total volume of 50 µl)
(see Notes 1, 2, 3):

(a) 12.8 µl sterile distilled H2O

(b) 5.0 µl 10x PCR buffer (contains 15 mM MgCl2)

(c) 1.0 µl linker 1 (0.8 µM final concentration)

(d) 2.0 µl 2 mM dNTPs

(e) 0.2 µl DNA polymerase (5 U/µl)

(f) 5.0 µl template DNA (repeat-enriched DNA)

21. Mix gently, centrifuge briefly, and place in thermocycler. 

22. PCR program: 

(a) Initial denaturation at 94 °C for 2 min 

(b) Amplification at 25-30 cycles of 

• 94 °C for 30 s

• 60 °C for 1 min

• 68 °C for 1 min

(c) Final elongation at 68 °C for 7 min

23. Run a 5 µl aliquot of the amplified enriched DNA on a 1 %
agarose gel to check amplification.

24. Purify the PCR reaction with a commercial PCR purification
kit.

25. Adenylate the 3'-ends of the PCR products by adding the
following to the tubes on ice (total volume of 5 µl):

(a) 1.5 µl sterile distilled H2O

(b) 0.5 µl 10x PCR buffer (contains 15 mM MgCl2)

(c) 1.0 µl 2 mM dATP

(d) 1.0 µl DNA polymerase (5 U/µl)

(e) 1.0 µl purified PCR products (150-200 ng/µl)

26. Mix gently, centrifuge briefly, place in thermocycler, and incu- 
bate at 70 °C for 30 min. 

27. Ligate adenylated PCR products into a TA cloning vector 
according to the instructions provided by the supplier and 
using standard procedures for transforming competent cells 
and isolation of plasmid DNA. 

28. Sequence plasmid DNA from positive recombinant clones 
using standard sequencing procedures. 

29. Design primers based on the sequences flanking the microsat- 
ellite arrays identified within the sequenced inserts. 

30. Primers must be tested and optimised in order to ensure 
faithful and consistent amplification. 

31. Add the following to the tubes on ice (total volume of 15 µl)
(see Notes 1, 2, 3):

(a) 8.9 µl sterile distilled H2O

(b) 1.5 µl 10x PCR buffer (contains 15 mM MgCl2)

(c) 1.0 µl fluorescently labelled forward primer (4 pmol/µl)
(see Note 5)

(d) 1.0 µl reverse primer (10 pmol/µl)

(e) 1.5 µl 2 mM dNTPs

(f) 0.1 µl DNA polymerase (5 U/µl)

(g) 1.0 µl template DNA (diluted to ca. 10-50 ng/µl)
(see Note 4)

32. Mix gently, centrifuge briefly, and place in thermocycler.

33. PCR program:

(a) Initial denaturation at 94 °C for 2 min

(b) Amplification at 20 cycles of

• 94 °C for 30 s

• 65 °C for 30 s, with a decrease in annealing temperature
of 0.5 °C/cycle during the following cycles

• 72 °C for 30 s

(c) Amplification at 15 cycles of

• 94 °C for 30 s

• 55 °C for 30 s (see Note 6)

• 72 °C for 30 s

(d) Final elongation at 72 °C for 3 min

(e) Hold reactions at 4 °C

34. Store reactions at -20 °C.

35. Sample preparation for electrophoretic separation depends
strongly on the instrument used. A polyacrylamide gel matrix
or a capillary electrophoresis system is usually used, and
separation is carried out according to the specifications of
the respective electrophoretic apparatus.

4.3.4. SSCP (Single Strand Conformation Polymorphism)

The SSCP method relies on the fact that the mobility of
single-stranded DNA fragments in a non-denaturing gel is distinctly
affected by very small changes in nucleic acid sequence. These
small differences become evident because of the relatively
unstable nature of single-stranded DNA: when double-stranded DNA
is denatured and the single strands are subsequently allowed to
renature, they undergo three-dimensional folding in which
intra-strand base pairing may occur, resulting in loops and folds
that give each single DNA strand a unique conformation. This in
turn noticeably affects the migration behaviour of the
single-stranded DNA fragments, since DNA molecules with a
different folded structure may migrate faster or slower in a gel
matrix depending on their 3D conformation (Fig. 4.5a).

SSCP analysis typically begins with the amplification
of a fragment of known length in a PCR reaction using
specific primers. Some examples of primers commonly used for
SSCP analysis are given in Table 4.1. Usually, the length of the
amplification product should be between 100 and 400 bp, since it
has been shown that single base pair differences are resolved more
accurately for smaller fragments. Occasionally, however,
fragments of up to 700 or 800 bp have also been analysed successfully.
After PCR, the amplified DNA fragments are denatured,
usually by adding a denaturing agent such as formamide, DMSO,
or NaOH to an aliquot of the PCR reaction and heating it at 95 °C
for 5 min, followed by snap chilling the samples on ice.
Electrophoretic separation takes place in a non-denaturing gel,
based either on polyacrylamide or on a related gel matrix. During
electrophoretic separation, a low constant temperature below 10 °C
should be maintained. This is usually achieved by connecting
the electrophoresis apparatus to a circulating water bath that
automatically adjusts the temperature of the circulating water
in order to keep the buffer at the desired temperature
during electrophoresis. SSCP patterns are detected in the gel
either by staining with agents such as SYBR Gold,
ethidium bromide, or silver nitrate or, alternatively, by using an
isotopic SSCP protocol. An example of an SSCP profile is shown in
Fig. 4.5b. Frequently, SSCP products are also subsequently
sequenced to locate nucleotide differences between samples in
the respective PCR product.

Fig. 4.5 (a) Schematic outline of SSCP analysis. Genomic DNA is amplified with gene-specific primers; amplicons are
denatured and renatured prior to electrophoresis. Amplicons with slight differences in sequence take a different
conformation when they renature, resulting in differences in their migration speed in an electrophoretic gel. (b) Example of a
banding pattern obtained after SSCP analysis of eight samples of the entomopathogenic fungus Beauveria bassiana based
on amplification of part of the beta tubulin gene. M designates a molecular weight marker (picture courtesy of authors).

1. Add the following to the tubes on ice (total volume of 50 µl)
(see Notes 1, 2, 3):

(a) 37.3 µl sterile distilled H2O

(b) 5.0 µl 10x PCR buffer (contains 15 mM MgCl2)

(c) 2.0 µl forward primer (20 pmol/µl)

(d) 2.0 µl reverse primer (20 pmol/µl)

(e) 1.6 µl 2 mM dNTPs

(f) 0.1 µl DNA polymerase (5 U/µl)

(g) 2.0 µl template DNA (diluted to ca. 20 ng/µl) (see Note 4)

2. Mix gently, centrifuge briefly, and place in thermocycler.

3. PCR program:

(a) Initial denaturation at 94 °C for 30 s

(b) Amplification at 30 cycles of

• 94 °C for 30 s

• 53 °C for 30 s (see Note 6)

• 72 °C for 1 min

(c) Final elongation at 72 °C for 5 min

(d) Hold reactions at 4 °C

4. Store reactions at -20 °C.

5. Add 10.0 µl 1x formamide dye to the tubes on ice, mix, and
centrifuge briefly.

6. Denature samples for 3 min at 95 °C, then quickly chill on ice.

7. Load 10-15 µl of the reaction on a polyacrylamide or other
non-denaturing gel matrix and separate fragments according
to the specifications of the electrophoretic apparatus that is
used. Keep the temperature constant during the separation at
7-9 °C depending on the requirements of the specific
amplification products.
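
When many samples are processed, the shared components of the reaction set-up in step 1 above are usually combined into a master mix. The following Python sketch scales the per-reaction volumes of this protocol; the 10 % pipetting reserve and the function name are assumptions for illustration and are not part of the protocol.

```python
# Per-reaction volumes in µl, taken from the SSCP reaction set-up above.
PER_REACTION = {
    "sterile distilled H2O": 37.3,
    "10x PCR buffer": 5.0,
    "forward primer (20 pmol/µl)": 2.0,
    "reverse primer (20 pmol/µl)": 2.0,
    "2 mM dNTPs": 1.6,
    "DNA polymerase (5 U/µl)": 0.1,
}
TEMPLATE_UL = 2.0  # template DNA is added to each tube individually

def master_mix(n_samples, overage=0.10):
    """Scale the shared components for n_samples plus a pipetting reserve."""
    factor = n_samples * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION.items()}

mix = master_mix(8)
for name, vol in mix.items():
    print(f"{vol:7.2f} µl  {name}")

aliquot = sum(PER_REACTION.values())
print(f"aliquot per tube: {aliquot:.1f} µl + {TEMPLATE_UL} µl template")
```

Each tube then receives 48.0 µl of master mix plus 2.0 µl of template, which reproduces the 50 µl reaction volume given in step 1.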



4.3.5. ARDRA (Amplified Ribosomal DNA Restriction Analysis)



The procedure briefly involves the following steps: 

1 . PCR primers are chosen based on the aim of the experiment. 
PCR reaction is set up based on the base composition of the 
primers and the length of the amplicon. 

2. Restriction enzyme digestion. 

3. Separation of products of the restriction digest on an agarose 
or polyacrylamide gel. 



The detailed procedure includes 

1. Add the following to the tubes on ice (total volume of 50 µl)
(see Notes 1, 2, 3):

(a) 37.3 µl sterile distilled H2O

(b) 5.0 µl 10x PCR buffer (contains 15 mM MgCl2)

(c) 2.0 µl forward primer (20 pmol/µl)

(d) 2.0 µl reverse primer (20 pmol/µl)

(e) 5.0 µl 2 mM dNTPs

(f) 0.1 µl DNA polymerase (5 U/µl)

(g) 2.0 µl template DNA (diluted to ca. 20 ng/µl) (see Note 4)

2. Mix gently, centrifuge briefly, and place in thermocycler.

3. PCR program:

(a) Initial denaturation at 94 °C for 5 min

(b) Amplification at 30 cycles of

• 94 °C for 30 s

• 53 °C for 30 s (see Note 6)

• 72 °C for 2 min

(c) Final elongation at 72 °C for 5 min

(d) Hold reactions at 4 °C

4. Store reactions at -20 °C.

5. Run a 5 µl aliquot of the amplified DNA with 3.0 µl 6x
loading dye on a 1 % agarose gel to check amplification.

6. Load 4 µl of a DNA size marker.

7. Separate fragments at 80 V for ca. 1 h.

8. If amplification is successful, purify the PCR reaction mixture
with a commercial PCR purification kit.

9. Pipette together:

(a) <43 µl of the PCR product

(b) 5.0 µl 10x restriction buffer appropriate for the restriction
enzyme chosen*

(c) 2.0 µl restriction enzyme (20 U/µl)

(d) Fill up to a total volume of 50 µl with sterile distilled
H2O

10. Mix gently, centrifuge briefly, and incubate at the temperature
and for the time recommended for the restriction enzyme used.

11. Inactivate restriction enzymes by incubating 15 min at 65 °C,
and place on ice (or hold at 4 °C).

12. Separate fragments on a 2 % agarose or a polyacrylamide gel.

13. Document the image for analysis.

*The restriction enzyme with the highest number of recognition
sequences in the amplified sequence is chosen. When the
sequence is known, suitable restriction enzymes can be identified
with software (such as NEBcutter at
http://tools.neb.com/NEBcutter2).
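
When the amplified sequence is known, the enzyme choice described in the note above can be approximated with a simple site count. The Python sketch below is an illustration only: the amplicon sequence is invented, the enzyme panel is a small arbitrary selection, and the script does not replace dedicated tools such as NEBcutter.

```python
# Recognition sequences of a few common restriction enzymes.
# All four sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {"HaeIII": "GGCC", "MspI": "CCGG", "RsaI": "GTAC", "EcoRI": "GAATTC"}

def count_sites(seq, site):
    """Count (possibly overlapping) occurrences of a recognition site."""
    seq, site = seq.upper(), site.upper()
    return sum(1 for i in range(len(seq) - len(site) + 1)
               if seq[i:i + len(site)] == site)

# Hypothetical amplicon sequence, for illustration only.
amplicon = "ATGGCCTTACCGGAAGTACGGCCTTGAATTCAGGCCAT"

counts = {name: count_sites(amplicon, site) for name, site in ENZYMES.items()}
best = max(counts, key=counts.get)
print(counts)
print("most frequent cutter:", best)
```

For community samples, the same count would have to be repeated across representative sequences of the expected taxa, since a good enzyme is one that cuts at different positions in different organisms.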



4.3.6. T-RFLP (Terminal Restriction Fragment Length Polymorphism)



This profiling technique is based on the position of the restriction
site closest to a labelled end of an amplified gene. It is a useful
modification of the ARDRA technique in that only one (terminal)
fragment is observed per amplicon, while several fragments are
observed in ARDRA. The amplification of a DNA sample is performed
with one or both primers labelled at the 5' end
with a fluorescent molecule. When both primers are labelled,
differently coloured fluorescent dyes are required. The fluorescent
dyes usually used are 6-FAM, ROX, TAMRA, and HEX, of which
6-FAM is the most common. The PCR products are
digested with a restriction enzyme (a 4 bp-cutter) and the fragments
are separated using either capillary or polyacrylamide
electrophoresis in a DNA sequencer. Only the fluorescently labelled
terminal fragments are visible on a sequencer, while the rest of the
fragments are not seen (Fig. 4.6). The T-RFLP profile is a graph
(electropherogram), with the X-axis representing the size of the
fragments and the Y-axis the fluorescence intensity of each
fragment. What appears on an electrophoresis gel as a band is
represented as a peak on the electropherogram. It is assumed
that in a T-RFLP profile, each peak corresponds to one genetic
variant in the original sample and its height/area to the relative
abundance of that variant in the community. But these assumptions
are not always correct. Often, several different microbes in the
population have a restriction site for the enzyme used at the same
position and thus give a single peak on the
electropherogram. To overcome this problem and to increase the
resolving power of the technique, aliquots of the PCR product
can be digested in parallel with several (often three) enzymes,
resulting in several T-RFLP profiles for a single sample. Each of
these profiles can resolve some variants while missing others. The
resolving power of the T-RFLP profiles can be further enhanced
by also labelling the reverse primer, with a different fluorescent
dye. This way, two parallel profiles can be generated per sample,
each resolving a different number of variants. Usually the PCR
products are treated with a single-strand-specific nuclease such as
mung bean nuclease before restriction enzyme digestion to prevent the
formation of pseudo-T-RFLPs (arising from random pairing of
ssDNA in the PCR products). Replicate experiments are done to
recognise false background (noise) peaks.

Fig. 4.6 Schematic outline of T-RFLP. Sample DNA is amplified with gene-specific primers, one of the primers being
fluorescently labelled. The amplified products are digested with different restriction enzymes and the terminal fragments
are detected due to the fluorescent tag in the form of a peak on the electropherogram emerging in an automatic DNA
sequencer.
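
The reason why different variants can collapse into one peak is easy to see in a small in-silico digest. The Python sketch below uses three invented amplicon sequences and MspI (C^CGG) as an example enzyme; sequences, names, and the helper function are hypothetical.

```python
def terminal_fragment(seq, site, cut_offset):
    """Length of the 5' terminal fragment: bases from the labelled 5' end
    up to the cut position within the first recognition site.
    Returns the full length if the site is absent (uncut amplicon)."""
    pos = seq.upper().find(site.upper())
    return len(seq) if pos < 0 else pos + cut_offset

# Hypothetical amplicons from three community members, labelled at the 5' end.
community = {
    "variant_1": "ATGCATGCCGGTTAGCTAGG",
    "variant_2": "TTGACCACCGGATCCGGAAT",
    "variant_3": "ATGCACCGGTTTTTTTTTTT",
}

# MspI recognises CCGG and cuts after the first C (offset 1).
trfs = {name: terminal_fragment(s, "CCGG", 1) for name, s in community.items()}
print(trfs)
print("distinct peaks:", len(set(trfs.values())))
```

Here variant_1 and variant_2 differ in sequence but share the position of their first MspI site, so three variants yield only two peaks. Digesting with a second enzyme, or labelling the reverse primer, may separate them.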

1. Add the following to the tubes on ice (total volume of 50 µl)
(see Notes 1, 2, 3):

(a) 5.0 µl 10x PCR buffer (contains 15 mM MgCl2)

(b) 2.0 µl forward primer (5 pmol/µl) (see Note 5)

(c) 2.0 µl reverse primer (5 pmol/µl)

(d) 5.0 µl 2 mM dNTPs

(e) 32.8 µl sterile distilled H2O

(f) 0.2 µl DNA polymerase (5 U/µl)

(g) 1.0 µl template DNA (diluted to ca. 10 ng/µl) (see Note 4)

2. Mix gently, centrifuge briefly, and place in thermocycler. 

3. PCR program: 

(a) Initial denaturation at 94 °C for 2 min 

(b) Amplification at 45 cycles of 

• 94 °C for 1 min 

• 50 °C for 1 min (see Note 6) 

• 72 °C for 2 min 

(c) Final elongation at 72 °C for 4 min 

(d) Hold reactions at 4 °C 

4. To check for amplification, run 5 µl of the PCR reaction mix
on a 1.5 % agarose gel at 80 V for 1 h. If amplification looks
good, proceed to next step.

5. Clean the PCR products with a suitable kit. 

6. Concentrate the pooled PCR reactions to a fifth of the original 
volume using a Speedvac or ethanol precipitation. 

7. Pipette together for a 50 µl volume

(a) “x vol” µl DNA (containing 100-150 ng DNA)

(b) 5.0 µl 10x restriction enzyme buffer

(c) 2.0 µl restriction enzyme* (20 U/µl)

(d) Fill up to a total volume of 50 µl with sterile distilled
H2O

8. Mix gently, centrifuge briefly, and incubate at the temperature
and for the time recommended for the respective restriction enzyme.

9. Inactivate restriction enzymes by incubating 15 min at 65 °C, 
and place on ice (or hold at 4 °C). 

*Various restriction enzymes can be used in single-enzyme
reactions in order to determine which one yields the highest
number and most even distribution of terminal restriction
fragments. Alternatively, if the sequence of the amplified product is
known, software (such as NEBcutter at http://tools.neb.com/
NEBcutter2) can be used to choose the enzymes.

10. Load an aliquot of the restriction digest, with an appropriate
volume of loading dye, on a polyacrylamide gel. The
polyacrylamide gel is cast as per the instructions given for the
automatic DNA sequencer being used. If capillary
electrophoresis is used, a desalting step is also done before
running the sample. The run is carried out at the temperature and
for the time recommended by the manufacturer of the automatic
DNA sequencer.

11. The electropherogram is analysed using appropriate software.
The analysis can be complemented with a clone library to
validate the peaks in the electropherogram and to assess the
relative abundance of each variant in the library.



4.3.7. DGGE and TGGE (Denaturing/Temperature Gradient Gel Electrophoresis)

A given sequence of DNA from different individuals within a
species may have small differences in sequence. These small
(even single-nucleotide) differences can result in a change in
the melting point of the sequence. The melting temperature of a
DNA molecule is a function of its GC content: the higher the GC
content, the higher the denaturing temperature. In DGGE and TGGE,
a specific region of DNA is amplified through PCR. The forward
primer of the region to be amplified carries a GC-rich sequence
about 40 nucleotides long, referred to as the GC clamp. When the
PCR products from slightly different templates are run on a
denaturing gradient gel, the migration of each DNA molecule stops
when it is denatured. The GC clamp at one end of the amplified DNA
prevents the denatured strands from coming apart. Thus PCR
products of the same size but from different individuals within a
species or from different species (and therefore with differences
in sequence) end up at different places on the gel (Fig. 4.7). If
species-specific probes are available, individual bands on the gel
can be identified by subsequent Southern blotting and
hybridisation. Very commonly the conserved regions of the
ribosomal genes (16S rRNA, 18S rRNA, or ITS; see Table 4.1) are
targeted for PCR. These techniques are not suitable when the
sequence is exceptionally GC rich or when it is longer than 400 bp.
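
The dependence of melting temperature on GC content can be made concrete with a short calculation. This is a rough sketch only: it uses one commonly quoted empirical approximation for long duplexes, Tm = 81.5 + 16.6·log10[Na+] + 41·(GC fraction) - 675/N, the 50 mM salt value and the toy sequences are assumptions, and real DGGE/TGGE behaviour depends on local melting domains rather than on a whole-molecule Tm.

```python
import math


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def approx_tm(seq, na_molar=0.05):
    """Coarse empirical Tm estimate (deg C) for a long duplex.

    Tm = 81.5 + 16.6*log10([Na+]) + 41*(GC fraction) - 675/N
    The salt concentration default is an assumption for illustration.
    """
    n = len(seq)
    return 81.5 + 16.6 * math.log10(na_molar) + 41.0 * gc_fraction(seq) - 675.0 / n


# Two invented 200-nt alleles that differ by a single A -> G substitution.
allele_1 = "ATGCATTA" * 25
allele_2 = "G" + allele_1[1:]

for name, seq in (("allele_1", allele_1), ("allele_2", allele_2)):
    print(name, f"GC = {gc_fraction(seq):.3f}", f"Tm ~ {approx_tm(seq):.2f} C")
```

Even a single substitution shifts the estimate slightly, which is the kind of difference a denaturing gradient is meant to expose.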

1. Add the following to the tubes on ice (total volume of 50 µl)
(see Notes 1, 2, 3):

(a) 37.3 µl sterile distilled H2O

(b) 5.0 µl 10x PCR buffer (contains 15 mM MgCl2)

(c) 2.0 µl forward primer (20 pmol/µl)

(d) 2.0 µl reverse primer (20 pmol/µl)

(e) 5.0 µl 2 mM dNTPs

(f) 0.2 µl DNA polymerase (5 U/µl)

(g) 2.0 µl template DNA (diluted to ca. 20 ng/µl) (see Note 4)

Fig. 4.7 Schematic outline of D/TGGE. Sample DNA is amplified with gene-specific primers; one of the primers has a ca.
40 bp long GC tail. The amplified products are run on a gel with a denaturing gradient (created by the gel constitution or
through temperature). The migrating DNA molecules denature when subjected to the denaturing condition in the gel. The
site on the gel where they completely denature/melt (except for the GC clamp) is a property of their sequence. The DNA
molecules stop migrating when they have melted completely. Slight differences in sequence result in differences in the
melting point of the DNA molecules and hence in the distance they travel on a gel with a denaturing gradient.



2. Mix gently, centrifuge briefly, and place in thermocycler 

3. PCR program: 

(a) Initial denaturation at 94 °C for 30 s 

(b) Amplification at 30 cycles of 

• 94 °C for 30 s

• 53 °C for 30 s (see Note 6) 

• 72 °C for 1 min 

(c) Final elongation at 72 °C for 5 min 

(d) Hold reactions at 4 °C 

4. Store reactions at -20 °C.

5. Check for amplification by running 5 µl on a 1 % agarose gel.

6. For DGGE, prepare the polyacrylamide gel [8 % (w/v)
acrylamide stock solution (acrylamide/bisacrylamide ratio of
37.5:1) in 1x TAE buffer (pH 8.0)] with a denaturing gradient
of 35-60 % [100 % denaturant contains 7 M urea and 40 %
(v/v) formamide]. Maintain the pour rate of the gel at 4 ml/
min for an 18 x 16 cm gel. Add 3 ml of stacking polyacrylamide
gel with no denaturant after the denaturing gel has
polymerised for 10 min.

7. For TGGE, prepare the polyacrylamide gel as per the
instructions of the apparatus manufacturer. The apparatus has
provision to create a temperature gradient that increases from
the wells to the bottom of the gel.

8. Load 10-15 µl of the reaction on a polyacrylamide gel and
separate fragments according to the specifications of the
electrophoretic apparatus.

9. Stain the gel for 20 min with ethidium bromide or SYBR
Green. Wash the gel twice for 5 min with Milli-Q water and
view on a UV transilluminator to capture the picture in digital
form.

10. Analyse the gel picture according to the aim of the experiment
using appropriate software. The analysis can be supplemented
by sequencing of the alleles identified in the gels.



4.3.8. RFLP (Restriction Fragment Length Polymorphism)

Due to spontaneous changes in DNA sequence, restriction enzyme
sites are either created or lost in different regions of the genome. In a
given length of DNA, the difference in the position of the restriction
sites can be recognised when the restriction digest of the DNA
is hybridised with a probe matching that region (Fig. 4.8a).
An example picture of an RFLP fingerprint is given in Fig. 4.8b.
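
The principle can be sketched with a small simulation. This is an illustration under stated assumptions, not part of the protocol: the two alleles are invented sequences, the probe coordinates are arbitrary, and the function names are hypothetical. EcoRI (recognition site GAATTC, cutting after the first G) is used as the example enzyme.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return (start, end) coordinates of the fragments of a complete digest.

    Defaults model EcoRI, which recognises GAATTC and cuts after the first G.
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return list(zip(bounds[:-1], bounds[1:]))


def probed_fragment_lengths(seq, probe_start, probe_end):
    """Lengths of the fragments overlapping the probe region [start, end)."""
    return [end - start for start, end in digest(seq)
            if start < probe_end and end > probe_start]


# Invented alleles: a point mutation in allele_2 destroys the second site.
allele_1 = "A" * 30 + "GAATTC" + "T" * 40 + "GAATTC" + "C" * 24
allele_2 = "A" * 30 + "GAATTC" + "T" * 40 + "GAGTTC" + "C" * 24

# A probe hybridising to positions 40-60 detects a different band per allele.
print(probed_fragment_lengths(allele_1, 40, 60))
print(probed_fragment_lengths(allele_2, 40, 60))
```

The loss of one site turns the hybridising band into a longer fragment, which is the polymorphism that appears on the blot.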

1. Pipette together in a 20 µl volume

(a) “x vol” µl DNA (3-5 µg DNA)

(b) 5.0 µl 10x restriction enzyme buffer

(c) 2.0 µl restriction enzyme (20 U/µl)

(d) Fill up to a total volume of 20 µl with sterile distilled
H2O

2. Mix gently, centrifuge briefly, and incubate at the temperature 
and time recommended for the enzyme. It would be ideal to 
leave digestion overnight. 

3. Inactivate restriction enzymes by incubating 15 min at 65 °C, 
and place on ice (or hold at 4 °C). 

4. Cast a 0.6-1 % (depending on the expected size range of the 
fragments) agarose gel. Use wide thin combs to accommodate 
a large volume of the sample. 

5. Load the restriction digest (maximum volume that can fit in
the well) with 3.0 µl 6x loading dye.

6. Load 4 µl of DNA size marker (choose a marker suitable for
the expected size of the fragments).

7. Run the gel overnight at 70 V (about 16 h) in 1x TBE
running buffer. Photograph the gel on a UV transilluminator
with a ruler placed beside it in order to measure the distance
run by the size marker bands.

8. To depurinate the DNA in the gel, place it in 0.25 M HCl for
20 min.

Fig. 4.8 (a) Schematic outline of RFLP. Genomic DNA is digested with a restriction enzyme, restricted products are
separated on a gel through electrophoresis, and the separated fragments of DNA on the gel are Southern blotted to a
membrane and hybridised with a labelled probe sequence. The region spanned by the probe may be in one, two, or three
fragments depending on the number of recognition sites of the restriction enzyme used. The fragments that hybridise
with the probe light up because of the label, and the polymorphism in the locus represented by the probe DNA in a
population can be detected. (b) Example of an RFLP used for generating telomere fingerprints. RFLP fingerprints were
generated from the EcoRI-digested genomic DNA of the entomopathogenic fungi Nomuraea rileyi and Beauveria bassiana
using a 32P end-labelled oligonucleotide of telomere repeat sequence (5'-TTTAGG-3')4 as a probe. Size of molecular
weight marker bands is given at the left.

9. Meanwhile set up the vacuum blot apparatus.

10. Wash the gel with Millipore water and place it adjacent to the
membrane in the blot apparatus. Leave the set-up for 1 h, and
check from time to time that there is no leakage. If there is a
small leakage, add more NaOH to keep the gel submerged at all
times. At the end, suck out the NaOH. Mark the positions
of the wells on the membrane with a pencil.*

11. Wash the membrane with 2x SSC to clean the agarose. 

12. Expose the membrane to UV light in a UV fixer apparatus 
(for the time recommended). 

13. Air-dry the membrane, wrap it in aluminium foil, and store at -20 °C.

* If a vacuum blotting apparatus is not available, capillary 
transfer can be made according to [36]. 

14. Label the probe DNA (25-50 ng) using a method of choice
(end or strand labelling and radio- or nonradioactive label) as
per the instructions of the labelling kit.

If the membrane is being used for the first time, an
overnight pre-hybridisation is needed; otherwise 4-5 h is
sufficient.

15. Wet the membrane with either distilled water or 2x SSC.
Drain off the excess water or SSC. Roll the membrane into
the hybridisation tube. Pour in 20 ml of the pre-hybridisation
solution (see below) along with 600 µg of sonicated and
denatured (boiled for 5 min) salmon sperm DNA (10 µg/µl)
and incubate in the rotating hybridisation oven at 65 °C.
After pre-hybridisation, start hybridisation by adding to the
pre-hybridisation solution the labelled probe (denatured by
boiling at 100 °C for 10 min and placing on ice immediately).
Incubate overnight.

If radioactive label is used, pour out all the wash solutions in a 
container meant for disposal of radioactive waste. The strin- 
gency (concentration of the wash solution and temperature of 
incubation of the remaining washings) is decided based on the 
base composition of the probe. A general protocol is given here. 

16. Place the hybridised membrane in a plastic box. Pour 400 ml 
of 2x SSC into it. 

17. Place it on a shaker and shake at medium speed at room 
temperature for 10 min. Pour off the solution. 

18. Add 500 ml of preheated (65 °C) 1x SSC + 0.1 % SDS and
incubate on a slow shaking platform for 20 min at 65 °C. Pour
off the solution and replace with preheated (65 °C) 0.2x
SSC + 0.1 % SDS and incubate with shaking for 20 min at
65 °C. If radioactive label is used, measure the radioactivity
level on the membrane with a Geiger-Müller counter. If the
count is more than 5 cpm, wash the membrane again at 65 °C
with 0.1x SSC + 0.1 % SDS for 20 min.

19. Dry the membrane and expose to an X-ray film. Store the
X-ray cassette in a freezer at -80 °C.

20. Develop the X-ray film. 



4.4 Notes 



4.4.1. Some General Considerations for PCR-Based DNA Fingerprinting Methods

1. For setting up PCR reactions, prepare the master mix (minus
DNA) in a volume sufficient for all reactions. Add the
components (dNTP mixture, primers, PCR buffer, and water) of
the master mix to a tube placed on ice. Always add Taq DNA
polymerase as the last component to the master mix. Mix the
master mix components by gentle shaking and spin the
contents down by pulse centrifugation. Dispense aliquots of
the master mix into PCR tubes/wells (placed on ice) and then
add the respective template DNA. Place the tubes or wells in
the thermocycler and proceed with the PCR program.

2. Take care to close tubes or plates properly to prevent
evaporation!

3. Always include a negative control (with no DNA) in the PCR
reactions.

4. Test reproducibility of PCR reactions by using at least two
independently isolated DNA templates from each sample and
repeating some of the PCR reactions, if possible using
different thermocyclers.

5. If fluorescently labelled primers are used (e.g. for AFLP
analysis), cover them and subsequent reactions with foil as much
as possible to avoid prolonged exposure to light!

6. Most of the annealing temperatures presented in the above
protocols are just examples; use an appropriate temperature
according to the specifications of the primers used.

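
The master mix described in Note 1 lends itself to a short calculation. The sketch below is only an illustration: the function name, the 10 % pipetting overage, and the choice to compute water as the remainder needed to reach the stated total volume (so it can differ from a listed water volume) are assumptions, and the per-reaction volumes are taken from the 50 µl reaction lists above.

```python
def master_mix(per_reaction_ul, n_reactions, total_ul=50.0,
               template_ul=1.0, overage=0.10):
    """Scale per-reaction volumes (in µl) to a master mix without template.

    Water is computed as the remainder needed to reach total_ul per reaction.
    overage (assumed 10 %) compensates for pipetting losses.
    Returns the mix and the aliquot volume to dispense per tube.
    """
    water = total_ul - template_ul - sum(per_reaction_ul.values())
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    factor = n_reactions * (1 + overage)
    mix = {"sterile distilled H2O": round(water * factor, 2)}
    # Dictionary order is kept, so list the polymerase last (see Note 1).
    for name, volume in per_reaction_ul.items():
        mix[name] = round(volume * factor, 2)
    return mix, total_ul - template_ul


components = {
    "10x PCR buffer": 5.0,
    "forward primer": 2.0,
    "reverse primer": 2.0,
    "2 mM dNTPs": 5.0,
    "DNA polymerase": 0.2,
}

mix, aliquot = master_mix(components, n_reactions=10)
for name, volume in mix.items():
    print(f"{name}: {volume} µl")
print(f"dispense {aliquot} µl per tube, then add template")
```

Remember to count the negative control (Note 3) among the reactions when choosing `n_reactions`.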
Chapter 5



Agarose Gel Electrophoresis and Polyacrylamide 
Gel Electrophoresis: Methods and Principles 

Sukumar Mesapogu, Chandra Mouleswararao Jillepalli,
and Dilip K. Arora 

Abstract 

Electrophoresis is a technique used to separate and sometimes purify macromolecules — especially proteins 
and nucleic acids — that differ in size, charge, or conformation. When charged molecules are placed in an 
electric field, they migrate toward either the positive or negative pole according to their charge. In 
contrast to proteins, which can have either a net positive or net negative charge, nucleic acids have a 
consistent negative charge imparted by their phosphate backbone and migrate toward the anode. Proteins 
and nucleic acids are electrophoresed within a matrix or gel. The gel is immersed within an electrophoresis 
buffer that provides ions to carry a current and some type of buffer to maintain the pH at a relatively 
constant value. Agarose is typically used at concentrations of 0.5-2 %. Agarose gels have a large range of 
separation, but relatively low resolving power. By varying the concentration of agarose, fragments of DNA 
from about 200 to 50,000 bp can be separated using standard electrophoretic techniques. SDS-PAGE uses
an anionic detergent (SDS) to denature proteins and the protein molecules become linearized. One SDS 
molecule binds to two amino acids. Due to this, the charge to mass ratio of all the denatured proteins 
in the mixture becomes constant. These protein molecules move in the gel (toward the anode) on the 
basis of their molecular weights only and are separated. The polyacrylamide chains are cross linked by 
N,N'-methylene bisacrylamide comonomers. Polymerization is initiated by ammonium persulfate (radical
source) and catalyzed by TEMED. 



5.1 Introduction 



Electrophoresis is a procedure that enables the sorting of
molecules based on size and charge. Using an electric field, molecules
(such as DNA) can be made to move through a gel made of agarose or
polyacrylamide. Nucleic acid molecules are separated by applying
an electric field to move the negatively charged molecules through
an agarose matrix. Shorter molecules move faster and migrate
farther than longer ones because they pass more easily through
the pores of the gel. This phenomenon is called sieving. Proteins
are separated by charge in agarose because the pores of the gel
are too large to sieve proteins. Gel electrophoresis can also be
used for separation of nanoparticles [1].

The term “gel” in this instance refers to the matrix used to
contain and then separate the target molecules. In most cases, the
gel is a cross-linked polymer whose composition and porosity are
chosen based on the specific weight and composition of the target
to be analyzed. When separating proteins or small nucleic acids 
(DNA, RNA, or oligonucleotides) the gel is usually composed of 
different concentrations of acrylamide and a cross-linker, producing 
different sized mesh networks of polyacrylamide. When separating 
larger nucleic acids (greater than a few hundred bases), the preferred 
matrix is purified agarose. In both cases, the gel forms a solid, yet 
porous matrix. Acrylamide, in contrast to polyacrylamide, is a 
neurotoxin and must be handled using appropriate safety precau- 
tions to avoid poisoning. Agarose is composed of long unbranched 
chains of uncharged carbohydrate without cross-links resulting in a 
gel with large pores allowing for the separation of macromolecules 
and macromolecular complexes. DNA gel electrophoresis is usually
performed for analytical purposes, often after amplification of 
DNA via PCR, but may be used as a preparative technique prior to 
use of other methods such as mass spectrometry, RFLP, PCR, 
cloning, DNA sequencing, or Southern blotting for further charac- 
terization [2]. 

Electrophoresis apparatus is arguably one of the most vital 
pieces of equipment in the laboratory. It consists of four main 
parts: a power supply (capable of at least 100 V and currents of up 
to 100 mA), an electrophoresis tank, a casting plate, and a well- 
forming comb. Apparatus is available from many commercial 
suppliers, but tends to be fairly expensive. The essence of electro- 
phoresis is that when DNA molecules within an agarose gel matrix 
are subjected to a steady electric field, they first orient in an end-on 
position and then migrate through the gel at rates that are 
inversely proportional to the log of the number of base pairs. 
Larger molecules migrate more slowly than smaller ones
because of their higher frictional drag and greater
difficulty in “worming” through the pores of the gel. This
relationship only applies to linear molecules. Circular molecules, such
as plasmids, migrate much more quickly than their molecular 
weight would imply because of their smaller apparent size with 
respect to the gel matrix [3]. 
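
In practice this relationship is used through a standard curve: the logarithm of the size of each ladder band is plotted against its migration distance, and unknown fragments are read off the fitted line. The following is a minimal sketch assuming an idealised, exactly log-linear ladder; the ladder values and function names are hypothetical, and real gels deviate from linearity at the extremes of the size range.

```python
import math


def fit_standard_curve(ladder):
    """Least-squares fit of log10(size_bp) = a + b * distance_mm.

    ladder: list of (size_bp, distance_mm) pairs from the size marker lane.
    """
    xs = [distance for _, distance in ladder]
    ys = [math.log10(size) for size, _ in ladder]
    n = len(ladder)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_size(distance_mm, intercept, slope):
    """Estimate fragment size (bp) from its migration distance."""
    return 10 ** (intercept + slope * distance_mm)


# Hypothetical ladder: (band size in bp, distance migrated in mm).
ladder = [(10000, 10.0), (1000, 30.0), (100, 50.0)]
a, b = fit_standard_curve(ladder)
print(f"unknown band at 40 mm: about {estimate_size(40.0, a, b):.0f} bp")
```

The negative slope reflects that larger fragments travel a shorter distance; measuring distances with a ruler photographed beside the gel gives the input values.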

There are limits to electrophoretic techniques. Since passing
current through a gel causes heating, gels may melt during
electrophoresis. Electrophoresis is performed in buffer solutions to
reduce pH changes due to the electric field, which is important
because the charge of DNA and RNA depends on pH, but
running for too long can exhaust the buffering capacity of the
solution. Further, different preparations of genetic material may not
migrate consistently with each other, for morphological or other
reasons.

Agarose gels are easily cast and handled compared to other
matrices, because the gel setting is a physical rather than chemical 
change. Samples are also easily recovered. After the experiment is 
finished, the resulting gel can be stored in a plastic bag in a 
refrigerator. Agarose gel electrophoresis can be used for the
separation of DNA fragments ranging from 50 bp to several
megabases (millions of bases) using specialized apparatus. The distance
between DNA bands of a given length is determined by the
percent agarose in the gel. The migration rate also depends on
other factors, such as the composition and ionic strength of the
electrophoresis buffer. The gel percentage presents the best way
to control the resolution of agarose gel electrophoresis (see
Table 5.1). The
disadvantage of higher concentrations is the long run times 
(sometimes days). Instead high percentage agarose gels should 
be run with a pulsed field electrophoresis (PFE) or field inversion 
electrophoresis. Most agarose gels are made with between 0.7 % 
(good separation or resolution of large 5-10 kb DNA fragments) 
and 2 % (good resolution for small 0.2-1 kb fragments) agarose 
dissolved in electrophoresis buffer. Up to 3 % can be used for 
separating very tiny fragments but a vertical polyacrylamide gel is 
more appropriate in this case. Low percentage gels are very weak
and may break when you try to lift them. High percentage gels are
often brittle and do not set evenly. 1 % gels are common for many
applications [4]. Agarose gels do not have a uniform pore size, but
are optimal for electrophoresis of proteins that are larger than
200 kDa [5].

A very common method for separating proteins by electro- 
phoresis uses a discontinuous polyacrylamide gel as a support 
medium and sodium dodecyl sulfate (SDS) to denature the proteins. 
The method is called sodium dodecyl sulfate polyacrylamide gel 
electrophoresis (SDS-PAGE). SDS (also called lauryl sulfate) is an 
anionic detergent; the negative charges on SDS destroy most of the 
complex structure of proteins and are strongly attracted toward an 
anode (positively charged electrode) in an electric field. Polyacryl- 
amide gels restrain larger molecules from migrating as fast as smaller 
molecules. Because the charge-to-mass ratio is nearly the same
among SDS-denatured polypeptides, the final separation of proteins 
is dependent almost entirely on the differences in relative molecular 
mass of polypeptides. Protein separation by SDS-PAGE can be 
used to estimate relative molecular mass, to determine the relative 
abundance of major proteins in a sample, and to determine the 
distribution of proteins among fractions. The purity of protein 
samples can be assessed and the progress of a fractionation or 
purification procedure can be followed. Different staining methods 
can be used to detect rare proteins and to learn something about 
their biochemical properties.

Many systems for protein electrophoresis have been
developed, and the apparatus used for SDS-PAGE varies widely. The
methodology used on these pages employs the Laemmli method. 
SDS-PAGE can be conducted on precast gels, saving the trouble 
and hazard of working with acrylamide. Regardless of the system, 
preparation requires casting two different layers of acrylamide 
between glass plates. The lower layer (the separating, or resolving, gel)
is responsible for actually separating polypeptides by size. The
upper layer (the stacking gel) includes the sample wells. It is designed
to sweep up proteins in a sample between two moving boundaries
so that they are compressed (stacked) into micrometer-thin layers
when they reach the separating gel.

Polyacrylamide gel electrophoresis (PAGE) is used for separ- 
ating proteins ranging in size from 5 to 2,000 kDa due to the 
uniform pore size provided by the polyacrylamide gel. Pore size is 
controlled by controlling the concentrations of acrylamide and 
bisacrylamide powder used in creating a gel. Care must be used 
when creating this type of gel, as acrylamide is a potent neurotoxin 
in both solution and powder form. Traditional DNA sequencing
techniques such as the Maxam-Gilbert or Sanger methods used
polyacrylamide gels to separate DNA fragments differing by a single
base pair in length so the sequence could be read. Most modern
DNA separation methods now use agarose gels, except for
particularly small DNA fragments. PAGE is currently most often used in the
field of immunology and protein analysis, often used to separate 
different proteins or isoforms of the same protein into separate 
bands. These can be transferred onto a nitrocellulose or PVDF 
membrane to be probed with antibodies and corresponding
markers, such as in a western blot. Typically, resolving gels are made at
6 %, 8 %, 10 %, 12 %, or 15 %. The stacking gel (5 %) is poured on top
of the resolving gel and a gel comb (which forms the wells and
defines the lanes where proteins, sample buffer, and ladders will be
placed) is inserted. The percentage chosen depends on the size of
the protein that one wishes to identify or probe in the sample. The
smaller the known molecular weight, the higher the percentage that
should be used. Changes to the buffer system of the gel can help to
further resolve proteins of very small sizes [6].

There are a number of buffers used for electrophoresis. The
most common for nucleic acids are Tris/Acetate/EDTA (TAE)
and Tris/Borate/EDTA (TBE). Many other buffers have been
proposed, e.g., lithium borate (LB), which, judging by PubMed
citations, is almost never used, isoelectric histidine, pK-matched
Good's buffers, etc.; in most cases the purported rationale is
lower current (less heat) and/or matched ion mobilities, which
leads to longer buffer life. Borate is problematic; it can
polymerize and/or interact with cis diols such as those found in RNA.
TAE has the lowest buffering capacity but provides the best
resolution for larger DNA. This means a lower voltage and more time,
but a better product. LB is relatively new and is ineffective in
resolving fragments larger than 5 kbp; however, with its low
conductivity, a much higher voltage can be used (up to 35 V/cm),
which means a shorter analysis time for routine electrophoresis. A
size difference as small as one base pair can be resolved in a 3 %
agarose gel with an extremely low conductivity medium (1 mM
lithium borate) [7].

After electrophoresis, the molecules in the gel can be
stained to make them visible. DNA may be visualized using
ethidium bromide, which fluoresces under ultraviolet light when
intercalated into DNA, while protein may be visualized using silver
stain [8] or Coomassie Brilliant Blue dye. Other methods may also
be used to visualize the separation of the mixture’s components 
on the gel. If the molecules to be separated contain radioactivity, 
for example in DNA sequencing gel, an autoradiogram can be 
recorded of the gel. Photographs can be taken of gels, often using 
Gel Doc. The most common dye used to make DNA bands visible 
for agarose gel electrophoresis is ethidium bromide, usually
abbreviated as EtBr. It fluoresces under UV light when intercalated
between the base pairs of DNA (or RNA). By running DNA through
an EtBr-treated gel and visualizing it with UV light, any band 
containing more than ~20 ng DNA becomes distinctly visible. 
EtBr is a known mutagen, and safer alternatives are available, such 
as GelRed, which binds to the minor groove. SYBR Green I is 
another dsDNA stain, produced by Invitrogen. It is more expensive, 
but 25 times more sensitive, and possibly safer than EtBr, though 
there is no data addressing its mutagenicity or toxicity in humans. 
Since EtBr stained DNA is not visible in natural light, scientists mix 
DNA with negatively charged loading buffers before adding the 
mixture to the gel. Loading buffers are useful because they are visible 
in natural light (as opposed to UV light for EtBr- stained DNA), and 
they co-migrate with DNA (meaning they move at the same speed
as DNA of a certain length). Xylene cyanol and Bromophenol blue 
are common dyes found in loading buffers; they run about the same 
speed as DNA fragments that are 5,000 bp and 300 bp in length 
respectively, but the precise position varies with percentage of the 
gel. Other less frequently used progress markers are Cresol Red and
Orange G which run at about 125 bp and 50 bp, respectively. After 
electrophoresis, the gel is illuminated with an ultraviolet lamp. The 
ethidium bromide fluoresces reddish-orange in the presence of 
DNA, since it has intercalated with the DNA. The gel can then be 
photographed usually with a digital or polaroid camera. Although 
the stained nucleic acid fluoresces reddish-orange, images are
usually shown in black and white. Even short exposure of nucleic acids
to UV light causes significant damage to the sample. UV damage to
the sample will reduce the efficiency of subsequent manipulation of 
the sample, such as ligation and cloning. If the DNA is to be used 
after separation on the agarose gel, it is best to avoid exposure to UV 
light by using a blue light excitation source such as the XcitaBlue UV 
to blue light conversion screen from Bio-Rad or Dark Reader from 
Clare Chemicals. A blue-excitable stain is required, such as one of the
SYBR Green or GelGreen stains. Blue light is also better for visuali- 
zation since it is safer than UV (eye protection is not such a critical 
requirement) and passes through transparent plastic and glass. This 
means that the staining will be brighter even if the excitation light 
goes through glass or plastic gel platforms. 

In the case of DNA, polyacrylamide is used for separating
fragments of less than about 500 bp; under appropriate
conditions, fragments of DNA differing in length by a single base
pair are easily resolved. In contrast to agarose, polyacrylamide gels
are used extensively for separating and characterizing mixtures of
proteins.



5.2 Materials 



5.2.1. Agarose Gel 
Electrophoresis 



1. Molecular-biology grade agarose (high melting point, see 
Table 5.1). 

2. Running buffer at 1x and 10x concentrations (see Table 5.2
for choice).

3. Sterile distilled water. 

4. A heating plate or microwave oven. 

5. Suitable gel apparatus and power pack (see Fig. 5.1). 

6. Ethidium bromide: dissolve in water at 10 mg/ml (Ethidium 
bromide is both carcinogenic and mutagenic and therefore 
must be handled with extreme caution). 

7. An ultraviolet (UV) light transilluminator (long-wave,
365 nm, damages the DNA more slowly than shorter wavelengths).

8. 5x loading buffer (see Note 2): Many variations exist, but this
one is fairly standard: 50 % (v/v) glycerol, 50 mM EDTA, pH
8.0, 0.125 % (w/v) bromophenol blue, 0.125 % (w/v) xylene
cyanol.

9. A size marker: a predigested DNA sample for which the
product band sizes are known. Many such markers are
commercially available.

Table 5.1

Resolution of agarose gels

| Agarose % | Mol wt range (kb) | Comments |
|---|---|---|
| 0.2 | 5-40 | Gel very weak; separation in 20-40 kb range improved by increase in ionic strength of running buffer (i.e., Loening's E); only use high-melting point agarose |
| 0.4 | 5-30 | With care can use low-melting point agarose |
| 0.6 | 3-10 | Essentially as above, but with greater mechanical strength |
| 0.8 | 1-7 | General-purpose gel; separation not greatly affected by choice of running buffer; bromophenol blue runs at about 1 kb |
| 1 | 0.5-5 | As for 0.8 % |
| 1.5 | 0.3-3 | As for 0.8 %; bromophenol blue runs at about 500 bp |
| 2.0 | 0.2-1.5 | Do not allow to cool to 50 °C before pouring |
| 3.0 | 0.1-1 | Can separate small fragments differing from each other by a small amount; must be poured rapidly onto a prewarmed glass plate |
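
Table 5.1 can be used as a simple lookup when planning a gel. The sketch below transcribes the size ranges from the table; the function names and the example fragment sizes and gel volume are hypothetical. The mass of agarose follows directly from the definition of % (w/v), i.e., grams per 100 ml.

```python
# (agarose %, smallest kb, largest kb), transcribed from Table 5.1
TABLE_5_1 = [
    (0.2, 5, 40), (0.4, 5, 30), (0.6, 3, 10), (0.8, 1, 7),
    (1.0, 0.5, 5), (1.5, 0.3, 3), (2.0, 0.2, 1.5), (3.0, 0.1, 1),
]


def suitable_percentages(smallest_kb, largest_kb):
    """Gel percentages whose range covers all expected fragment sizes."""
    return [pct for pct, low, high in TABLE_5_1
            if low <= smallest_kb and largest_kb <= high]


def agarose_grams(percent_w_v, gel_volume_ml):
    """Mass of agarose for a gel: % (w/v) means grams per 100 ml."""
    return percent_w_v * gel_volume_ml / 100.0


# Example: fragments expected between 0.5 and 3 kb, cast in a 50 ml gel.
for pct in suitable_percentages(0.5, 3):
    print(f"{pct} % gel: weigh {agarose_grams(pct, 50):.2f} g agarose")
```

If no single percentage covers the whole range, the returned list is empty and the fragments are better split across two gels.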



Table 5.2
Commonly used agarose gel electrophoresis running buffers

| Buffer | Description | Solution |
|---|---|---|
| Loening's E | High ionic strength, and not recommended for preparative gels | For 5 L of 10×: 218 g of Tris base, 234 g of NaH2PO4·2H2O, and 18.6 g of Na2EDTA·2H2O |
| Glycine | Low ionic strength, very good for preparative gels, but can also be used for analytical gels | For 2 L of 10×: 300 g of glycine, 300 ml of 1 M NaOH (or 12 g pellets), and 80 ml of 0.5 M EDTA, pH 8.0 |
| Tris-borate EDTA (TBE) | Low ionic strength; can be used for both preparative and analytical gels | For 5 L of 10×: 545 g of Tris, 278 g boric acid, and 46.5 g of EDTA |
| Tris-acetate (TAE) | Good for analytical gels and preparative gels when the DNA is to be purified by glass beads | For 1 L of 50×: 242 g of Tris base, 57.1 ml of glacial acetic acid, and 100 ml of 0.5 M EDTA, pH 8.0 |
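The stocks in Table 5.2 are concentrates (10× or 50×) that are diluted to 1× before use. A minimal sketch of that dilution, using the usual C1·V1 = C2·V2 relation, is given below; the function name and argument names are illustrative assumptions, not part of the protocol.

```python
def dilute_stock(stock_x, final_x, final_volume_ml):
    """Return (stock_ml, water_ml) needed to dilute a concentrate.

    Uses C1 * V1 = C2 * V2, with concentrations given as "x-fold"
    (for example 10 for a 10x stock, 1 for working strength).
    """
    if final_x <= 0 or stock_x < final_x:
        raise ValueError("stock must be at least as concentrated as the target")
    stock_ml = final_x * final_volume_ml / stock_x
    return stock_ml, final_volume_ml - stock_ml


if __name__ == "__main__":
    # 1 l of 1x buffer from a 10x stock, then from a 50x TAE stock.
    print(dilute_stock(10, 1, 1000))
    print(dilute_stock(50, 1, 1000))
```

For a 10× stock and 1 l of working buffer this reproduces the 100 ml of stock plus 900 ml of water used when setting up the gel.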



5.2.2. SDS-PAGE (Polyacrylamide Gel Electrophoresis)

1. Stacking gel solution: 10 ml of total volume is good for
2 mini gels, so measure out the other components and make up
to 10 ml final volume with distilled water. Final concentration
of acrylamide is 4.44 %.

| Component | Volume |
|---|---|
| 22.2 % Acrylamide/bisacrylamide | 2 ml |
| Distilled water | 6.6 ml |
| 1 M Tris/HCl pH 6.8 | 1.25 ml |
| 10 % SDS | 100 µl |
| 10 % Ammonium persulfate | 50 µl |
| TEMED | 5 µl |

Fig. 5.1 (a) Melting of agarose and dispensing into the casting tray, (b) Loading of the
samples



2. Solutions needed

22.2 % Acrylamide/Bisacrylamide mix: 22.2 g acrylamide,
0.6 g bis-acrylamide (37:1 cross-linker ratio) to 100 ml
water, filtered (Table 5.3). (Acrylamide is a potent neurotoxin
and should be handled with care! Wear disposable gloves when
handling solutions of acrylamide and a mask when weighing out
powder. Polyacrylamide is considered to be nontoxic, but
polyacrylamide gels should also be handled with gloves due to
the possible presence of free acrylamide.)

44.4 % Acrylamide/Bisacrylamide mix: 44.4 g acrylamide,
1.2 g bis-acrylamide (37:1 cross-linker ratio) to 100 ml
water, filtered (Table 5.4).

Reservoir/running buffer: 57.6 g glycine, 12 g Tris base, 4 g
SDS, water to 4 l.

Stain solution: 2.5 g Coomassie Brilliant Blue R-250, 450 ml
methanol, 100 ml glacial acetic acid, water to 1 l.

Destain solution: 300 ml methanol, 400 ml acetic acid, water
to 4 l.

Sample buffer 5×: make up 100 ml and store away in 5-10 ml
aliquots.
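As a cross-check, the weights given for the reservoir/running buffer can be converted into concentrations. The sketch below assumes standard molar masses for glycine (75.07 g/mol) and Tris base (121.14 g/mol); the function names are illustrative only.

```python
# Molar masses in g/mol (standard values, assumed here for the check).
MOLAR_MASS = {"glycine": 75.07, "tris": 121.14}


def millimolar(grams, molar_mass, volume_l):
    """Concentration in mM for a solute dissolved to the given final volume."""
    return grams / molar_mass / volume_l * 1000.0


def percent_w_v(grams, volume_l):
    """Concentration in % (w/v), i.e. grams per 100 ml."""
    return grams / (volume_l * 1000.0) * 100.0


if __name__ == "__main__":
    # Recipe from the text: 57.6 g glycine, 12 g Tris base, 4 g SDS in 4 l.
    print(round(millimolar(57.6, MOLAR_MASS["glycine"], 4.0)), "mM glycine")
    print(round(millimolar(12.0, MOLAR_MASS["tris"], 4.0)), "mM Tris")
    print(percent_w_v(4.0, 4.0), "% SDS")
```

The result agrees, to rounding, with the electrode buffer composition quoted later for running the gels (25 mM Tris, 192 mM glycine, 0.1 % SDS).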
Table 5.3
Proportions of chemical solution in order to prepare 5-12 % Acrylamide PAGE gels

| Chemicals | 12 % | 10 % | 8 % | 7.5 % | 6 % | 5 % |
|---|---|---|---|---|---|---|
| 22.2 % Acrylamide/0.6 % Bis | 10.81 ml | 9.01 ml | 7.21 ml | 6.76 ml | 5.41 ml | 4.5 ml |
| 1 M Tris/HCl pH 8.8 | 7.5 ml | 7.5 ml | 7.5 ml | 7.5 ml | 7.5 ml | 7.5 ml |
| Distilled water | 1.38 ml | 3.18 ml | 4.99 ml | 5.43 ml | 6.78 ml | 7.69 ml |
| 10 % SDS | 200 µl | 200 µl | 200 µl | 200 µl | 200 µl | 200 µl |
| 10 % Ammonium persulfate | 100 µl | 100 µl | 100 µl | 100 µl | 100 µl | 100 µl |
| TEMED | 10 µl | 10 µl | 10 µl | 10 µl | 10 µl | 10 µl |
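The volumes in Table 5.3 follow from a simple dilution of the acrylamide stock into a fixed total volume. The sketch below reproduces them under one assumption that is inferred from the column sums rather than stated in the text: each recipe makes about 20 ml of gel mix. The function and key names are illustrative.

```python
def resolving_gel_recipe(target_percent, stock_percent=22.2, total_ml=20.0):
    """Volumes (ml) for a resolving gel of the requested acrylamide percentage.

    The fixed components are taken from Table 5.3; the 20 ml total volume
    is an assumption inferred from the table, not stated in the text.
    """
    fixed_ml = {
        "tris_pH8_8_ml": 7.5,   # 1 M Tris/HCl pH 8.8
        "sds_10pct_ml": 0.2,    # 200 µl of 10 % SDS
        "aps_10pct_ml": 0.1,    # 100 µl of 10 % ammonium persulfate
        "temed_ml": 0.01,       # 10 µl TEMED
    }
    acrylamide_ml = target_percent * total_ml / stock_percent
    water_ml = total_ml - acrylamide_ml - sum(fixed_ml.values())
    if water_ml < 0:
        raise ValueError("target percentage too high for this stock")
    recipe = {
        "acrylamide_ml": round(acrylamide_ml, 2),
        "water_ml": round(water_ml, 2),
    }
    recipe.update(fixed_ml)
    return recipe


if __name__ == "__main__":
    print(resolving_gel_recipe(12))
    print(resolving_gel_recipe(10))
```

Passing `stock_percent=44.4` gives the corresponding volumes for the higher-percentage gels made from the 44.4 % stock.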



Table 5.4
Proportions of chemical solution in order to prepare 13.5-27 % Acrylamide PAGE gels

| Chemicals | 27 % | 24 % | 20 % | 17.5 % | 15 % | 13.5 % |
|---|---|---|---|---|---|---|
| 44.4 % Acrylamide/1.2 % Bis | 12.16 ml | 10.81 ml | 9.01 ml | 7.88 ml | 6.76 ml | 6.08 ml |
| 1 M Tris/HCl pH 8.8 | 7.5 ml | 7.5 ml | 7.5 ml | 7.5 ml | 7.5 ml | 7.5 ml |
| Distilled water | 0.03 ml | 1.38 ml | 3.18 ml | 4.31 ml | 5.43 ml | 6.11 ml |
| 10 % SDS | 200 µl | 200 µl | 200 µl | 200 µl | 200 µl | 200 µl |
| 10 % Ammonium persulfate | 100 µl | 100 µl | 100 µl | 100 µl | 100 µl | 100 µl |
| TEMED | 10 µl | 10 µl | 10 µl | 10 µl | 10 µl | 10 µl |



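Sample buffer 5× components:

| Component | Amount |
|---|---|
| 1 M Tris/HCl pH 6.8 | 31.25 ml |
| SDS powder | 10 g |
| Glycerol | 25 ml |
| Bromophenol blue (2 % in ethanol) | 750 µl |
| 2-Mercaptoethanol | 5 µl |
| Water | to 100 ml |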



5.2.3. Silver Staining SDS PAGE Gels

1. Silver nitrate

2. 1 % Citric acid: 100 ml of distilled water + 1 g of citric acid

3. 30 % NaOH (7.5 M): 100 ml of DI water + 30 g of NaOH

4. 14.8 M Ammonium hydroxide

5. 38 % Formaldehyde

6. Ultrapure water; use this for all steps and reagents

7. 50 % Aqueous glutaraldehyde (optional)

8. Glass tray or Novex StainEase gel tray. If using glass, make
sure to clean well with soap and DI water

5.3 Method

5.3.1. Agarose Gel Electrophoresis

5.3.1.1. Melting of Agar

1. An appropriate amount of powdered agarose (Table 5.1) is
weighed carefully into a conical flask.

2. One-tenth of the final volume of 10× concentrated running
buffer is added (Table 5.2), followed by distilled water to the
final volume (i.e., 1 ml of 10× buffer in 9 ml of distilled water
to make 10 ml of 1×).

3. Cover the container with plastic wrap. Pierce a small hole in
the plastic for ventilation.

4. Heat the solution in the microwave oven on high power until
it comes to a boil. Watch the solution closely; agarose foams
up and boils over easily.

5. Remove the container (protect your hand with a pot holder or
folded paper towel) and gently swirl it to resuspend any
settled agar.

6. Continue this process until the agar dissolves completely.

7. Cool the agar until you can comfortably touch the flask and
add ethidium bromide solution to give a final concentration
of 5 µg/ml.

8. The gel mixture is ready to be poured into the gel apparatus
(Fig. 5.1a).

5.3.1.2. Pouring the Gel

1. Place tape across the ends of the gel form and place the comb
in the form.

2. Pour cooled agar into the form. The agar should come at least
halfway up the comb teeth.

3. Immediately rinse and fill the agar flask with hot water to
dissolve any remaining agar.

4. When the agar has solidified, carefully remove the comb.

5. Remove the tape from the ends of the gel form.

5.3.1.3. Loading the Samples

1. Make a written record of which sample you will load in each
well of the gel. You may find it helpful to load samples in every
other well.

2. Place the gel form on a black or dark surface to help you see
the wells in the agar.

3. Fold a filter paper circle in half and hold it sideways using
tweezers.

4. Dip the filter paper into full-strength food color to saturate.

5. Gently ease the filter paper into the well.

6. Be careful not to puncture the bottoms of the wells as you
load each sample (Fig. 5.1b).

7. Repeat for remaining colors.

5.3.1.4. Setting up the Gel

1. Place the gel in the electrophoresis chamber.

2. Make sure that the wells are closest to the negative (black)
electrode.

3. To prepare the buffer, add 100 ml of 10× stock to 900 ml
deionized or distilled water to make 1 l of 1× running buffer
(the water source that works for you may depend on your
local water quality) and swirl to mix.

4. Fill with 1× running buffer to just cover the wells.

5. Fill each half of the chamber, adding solution until it is close
to the top of the gel. Gently flood the gel from the end
opposite the wells to minimize sample diffusion.

6. Place the lid on the chamber and connect the electrode leads
to the power supply.

7. Connect the black lead to the negative terminal and the red
lead to the positive terminal (Fig. 5.2a).

Fig. 5.2 (a) Setting up the gel for electrophoresis, (b) Running and analysis of the
samples under UV transilluminator

5.3.1.5. Running and Analyzing the Gel

1. Turn on the power supply and adjust the voltage to 50-100 V.

2. The gel is usually run for between 1 and 3 h, depending on the
percentage and length of the gel.

3. Once the dyes have moved through the gel, turn off the
power supply, disconnect the electrode leads, and remove
the chamber lid.

4. Remove the gel from the electrophoresis chamber and analyze
your results. Did some colors move further than others? Did
some colors separate into two?

5. After electrophoresis, the gel is removed from the apparatus,
and the products of the digestion can be viewed on a UV
transilluminator (Fig. 5.2b).

5.3.2. SDS-PAGE (Polyacrylamide Gel Electrophoresis)

5.3.2.1. Preparing SDS Gels

A gel of given acrylamide concentration separates proteins
effectively within a characteristic range. Very large polypeptides cannot
penetrate far into a gel and thus their corresponding bands may be
too compressed for resolution. Polypeptides below a particular
size are not restricted at all by the gel, and regardless of mass they
all move at the same pace along with the tracking dye. Gel
concentration (%T) should be selected so that the proteins of interest
are resolved.

A typical gel of 7 % acrylamide composition nicely separates
polypeptides with molecular mass between 45 and 200 kDa.
Polypeptides below the cutoff of around 45 kDa do not resolve.
A denser gel, say 14 %T, usually resolves all of the smallest
polypeptides in a mix. Such a gel would be needed to resolve
hemoglobin, for example. It would be useless for resolving bands much
above 60 kDa, though. To analyze the entire profile of a fraction
that contains heavy and light polypeptides, one should usually run
two gels.

5.3.2.2. Cassettes

There are many systems for setting up gel cassettes. A simple
'mini-slab' gel system can be put together for surprisingly little
money and does the job quite well. We use casting
stands to prepare the mini-slab gels. Two clean plates with two
Teflon spacers make a single cassette. Stack the cassettes upright in
the stand with the bottoms of the cassettes tight to the bottom of
the stand, using modeling clay to seal a thick acrylic cover in place
against the last cassette to make a water-tight chamber. Using a
well-former (comb) as a template, mark a fill line for the first
(separating) gel solution about a centimeter below the comb teeth
(Fig. 5.3).

5.3.2.3. Separating Gel Preparation

The total volume between the plates of our gel cassettes is 10 ml,
so if we prepare 10 ml separating gel mix per cassette we have
more than enough. From 30 % acrylamide stock (see notes
below), we prepare gels of composition 7-15 % acrylamide,
depending on the range of proteins that we wish to separate.
Our separating gel buffer stock (4× concentrated) consists of
0.4 % SDS, 1.5 M Tris-Cl, pH 8.8. Per cassette, mix 2.5 ml buffer
stock with sufficient acrylamide stock for the desired percentage,
and bring the mix to final volume with distilled water.

Acrylamide polymerizes spontaneously in the absence of
oxygen, so the polymerization process involves complete removal
of oxygen from the solution. Polymerization is more uniform if
the mix is degassed to remove much of the dissolved oxygen, by
placing it under a vacuum for 5 min or so before polymerization.
Polymerization is initiated by adding freshly prepared 10 %
ammonium persulfate (AP) to the mix, followed by
N,N,N′,N′-tetramethylethylenediamine (TEMED). The amounts of each
depend on the quality of acrylamide used and should be determined in
advance by trial and error; usually 100 µl AP and 10 µl TEMED per
10 ml gel mix are used. Once the catalysts are added, polymerization
may occur quickly; thus it is necessary to have the casting stand
completely ready and to have the overlay solution ready to go. After
swirling the mix, pour the solution into the space occupied by the
cassettes. The cassettes will self-level eventually, but leveling can
be hurried along by adding solution to selected cassettes with a
Pasteur pipet.

Fig. 5.3 A single gel cassette and assembly (panels: empty casting stand, upright;
casting stand with cassettes prior to sealing with clay; top view of casting stand with
stacked cassettes, cover, and clay in place; side view showing clay and cover through
the acrylic, with back, stacked cassettes, and base labeled)

Immediately after pouring the gel mix, it must be overlaid
with water-saturated butanol to an additional height of 0.5 cm or
so. The purpose of butanol is to produce a smooth, completely
level surface on top of the separating gel, so that bands are straight
and uniform. Butanol holds very little water in solution, forming a
neat layer on top. Water would make an effective overlay but
would mix with the acrylamide solution, diluting it. Polymerization
can be confirmed by pulling some of the remaining gel mix
into the pipet, allowing it to stand, and checking it after 10 min or
so. It should not take more than 15 min for any of the gel
mixes to polymerize (Fig. 5.4).

Fig. 5.4 (a, b) Preparation of casting gel

5.3.2.4. Stacking Gel Preparation

Ten ml of stacking gel mix is sufficient for three of our cassettes;
however, for the sake of accuracy it may be preferable to make 20
or 30 ml. Stacking gel buffer stock consists of 0.5 M Tris-Cl, pH
6.8, with 0.4 % SDS. Typical stackers are 3-4.5 % acrylamide.
Before adding the final two components, which will start
polymerization, the butanol should be poured off the separating gels
into a sink with tap water running and excess butanol/acrylamide
removed from the surfaces with a pipet. After adding AP and
TEMED, immediately swirl the mix and pour it into the cassettes
to the tops of the plates. Insert combs one at a time, taking care
not to catch bubbles under the teeth, and adjust to make them
even if necessary, scraping excess stacking mix off later.

5.3.2.5. Preparing Protein Samples for Electrophoresis

A polypeptide is a macromolecule consisting of a nonbranching
sequence of amino acids, each connected to the next by a single
peptide bond. A protein consists of one or more polypeptides
and/or additional types of molecules, held together by any of a
number of molecular interactions, often including covalent bonds.
Such interactions result in several levels of organization, which
we call primary, secondary, tertiary, and quaternary structures.
Patterns of bands vary depending on temperature, buffer,
variations in pH, quality of a preparation, etc. To characterize a type of
preparation and obtain predictable results, we try to take proteins
apart so that what we have left is primary structure only. The
amino acid sequence of a polypeptide is called its primary
structure. Interaction of soluble proteins with water leads to hydrogen
bonding, which is partially responsible for the secondary structure
of proteins. Secondary structure refers to the local structure of a
polypeptide chain, including helices, pleated sheets, and turns. A
functional protein has a three-dimensional structure resulting
from hydrogen bonding, hydrophobic interactions, van der
Waals forces, and disulfide bonding. The three-dimensional structure
of a protein is called its tertiary structure. Quaternary structure
refers to the interaction of individual polypeptide chains with
other molecules to form functional proteins. Although some
proteins do consist of single polypeptides, many consist of two or
more polypeptides linked by covalent bonds and/or noncovalent
forces. In fact, many native (functional) proteins include
nonprotein components such as the carbohydrate groups on many
membrane-associated proteins (Fig. 5.5a).

5.3.2.6. Sample Denaturation

Various sample buffers have been used for SDS-PAGE, but all use
the same principles to denature samples. Good denaturation is
achieved by preparing a sample to a final concentration of 2 mg/ml
protein with 1 % SDS, 10 % glycerol, 10 mM Tris-Cl, pH 6.8, 1 mM
EDTA, a reducing agent such as dithiothreitol (DTT) or
2-mercaptoethanol, and a pinch of bromophenol blue to serve as
a tracking dye (~0.05 mg/ml) (Fig. 5.5b, c).

A 2× concentrate of sample buffer consists of 2 % SDS, 20 %
glycerol, 20 mM Tris-Cl, pH 6.8, 2 mM EDTA, 160 mM
dithiothreitol (DTT), and 0.1 mg/ml bromophenol blue dye.

What do the various components do?

1. EDTA is a preservative that chelates divalent cations, which
reduces the activity of proteolytic enzymes that require
calcium and magnesium ions as cofactors.

2. The Tris acts as a buffer, which is very important since the
stacking process in discontinuous electrophoresis requires a
specific pH.

3. Glycerol makes the sample denser than the running buffer,
so the sample will remain in the bottom of a well rather than
float out.

4. The dye allows the investigator to track the progress of the
electrophoresis.

5. SDS breaks up the two- and three-dimensional structure of
the proteins by adding negative charge to the amino acids,
immediately rendering them functionless (Fig. 5.5a).

6. Heating the samples to at least 60 °C shakes up the
molecules, allowing SDS to bind in the hydrophobic regions and
complete the denaturation (Fig. 5.5b).

7. DTT is a strong reducing agent. Its specific role in sample
denaturation is to remove the last bit of tertiary and
quaternary structure by reducing disulfide bonds (Fig. 5.5c).

Fig. 5.5 (a) A native (functional) integral membrane protein is embedded in the phospholipid bilayer at left; at right, the
anionic detergent has partially disrupted the interaction of protein and phospholipids. (b) Heating a sample in the
presence of SDS speeds up the disruption of secondary, tertiary, and quaternary structure. Dashed squares indicate folds
caused by disulfide bonds. Since they are covalent, disulfide bonds are not affected by SDS. (c) Dithiothreitol (DTT)
reduces disulfide bonds, removing the last traces of tertiary or quaternary structure
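The denaturation recipe above (2 mg/ml protein in 1× sample buffer, made from a 2× concentrate) can be turned into pipetting volumes. The sketch below is illustrative only: the protein stock concentration, the 100 µl final volume, and the function names are hypothetical inputs, not values from the protocol.

```python
def sample_mix(stock_mg_per_ml, final_ul=100.0, target_mg_per_ml=2.0):
    """Return (protein_ul, water_ul, buffer_2x_ul) for one denatured sample.

    Half of the final volume is 2x sample buffer, so the buffer ends up
    at 1x; water makes up whatever the protein stock does not fill.
    """
    buffer_ul = final_ul / 2.0
    protein_ul = target_mg_per_ml * final_ul / stock_mg_per_ml
    water_ul = final_ul - buffer_ul - protein_ul
    if water_ul < 0:
        raise ValueError("protein stock is too dilute for this target")
    return protein_ul, water_ul, buffer_ul


def loaded_protein_ug(load_ul, conc_mg_per_ml):
    """Micrograms of protein in a well (µl times mg/ml equals µg)."""
    return load_ul * conc_mg_per_ml


if __name__ == "__main__":
    # Hypothetical protein stock at 10 mg/ml, 100 µl final sample volume.
    print(sample_mix(10.0))
    # Loading 10 µl of a 2 mg/ml sample.
    print(loaded_protein_ug(10.0, 2.0))
```

A stock more dilute than twice the target concentration cannot be brought to 2 mg/ml with a 2× buffer, which the function reports as an error rather than returning a negative water volume.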



5.3.2.7. Amounts to Load

Polyacrylamide has a limited capacity for protein. Overloading
results in precipitation and aggregation of proteins, producing
streaks and smears. Underloading results in complete
disappointment. The objectives of sample preparation are to put the proteins
into a denaturing buffer, rendering them suitable for
electrophoresis, and to adjust the concentrations of sample so that an
appropriate amount of protein can be loaded onto a gel.

We obtain the best results if we load 10 µl of a 2 mg/ml final
concentration of denatured protein per sample well. Dilute all samples to
a predetermined concentration and volume before mixing with
the denaturing buffer.

5.3.2.8. Assembling, Loading, and Running Gels

The assembly of a gel running stand varies with the type of
apparatus. The top of the cassette must be continuous with an
upper buffer chamber and the bottom must be continuous with a
lower chamber so that current will run through the gel itself. The
cassette must be sealed in place using gaskets or a sealant such as
agarose. Fill both the upper and lower buffer compartments with
an electrode buffer (running buffer) consisting of 25 mM Tris,
192 mM glycine, 0.1 % sodium dodecyl sulfate.

5.3.2.9. Loading Gels

Hamilton syringes work well for loading samples into the wells.
Ideally, the glycerol in a sample causes it to sink neatly to the
bottom of the well, allowing as much as 20 µl or even more to be
loaded.

5.3.2.10. Running Gels

The anode (+ electrode) must be connected to the bottom
chamber and the cathode to the top chamber. The negatively charged
proteins will move toward the anode, of course. Gels are usually
run at a voltage that will run the tracking dye to the bottom as
quickly as possible without overheating the gels. Overheating can
distort the acrylamide or even crack the plates. The voltage to be
used is determined empirically. We run gels at 150 V.

5.3.2.11. Disassembly and Staining

When the dye front is nearly at the bottom of the gel, it is time to
stop the run. For low percent gels with a tight dye front, the dye
should be on the verge of running off the gel. When the percent of
acrylamide is high the dye front may be diffuse, since the dye is not
homogeneous. The plates are separated and the gel is dropped
into a staining dish containing deionized water. After a quick
rinse, the water is poured off and stain added. Staining usually
requires incubation overnight, with agitation.

5.3.2.12. Staining Protein Gels

A commonly used stain for detecting proteins in polyacrylamide
gels is 0.1 % Coomassie Blue dye in 50 % methanol and 10 %
glacial acetic acid. Acidified methanol precipitates the proteins.
Staining is usually done overnight with agitation. The agitation
circulates the dye, facilitating penetration, and helps ensure
uniformity of staining.

The dye actually penetrates the entire gel; however, it only
sticks permanently to the proteins. Excess dye is washed out by
'destaining' with acetic acid/methanol, also with agitation. It is
most efficient to destain in two steps, starting with 50 % methanol
and 10 % acetic acid for 1-2 h, then using 7 % methanol and 10 %
acetic acid to finish. The first solution shrinks the gel,
squeezing out much of the liquid component, and the gel swells
and clears in the second solution. Properly stained/destained gels
should display a pattern of blue protein bands against a clear
background. The gels can be dried down or photographed for
later analysis and documentation.

The original dye front, consisting of bromophenol blue dye,
disappears during the process. In fact, bromophenol blue is a pH
indicator which turns light yellow under acid conditions, prior to
being washed out. In low percentage gels, sufficient protein may
run with the dye front so that the position of the bromophenol
blue front is permanently marked with unresolved proteins, often
forming a continuous "front" across the bottom of the gel. In
higher percent gels, a distinct dye front is usually not obtained.

Coomassie blue may not stain some proteins, especially those
with high carbohydrate content. Stains such as periodic acid-Schiff
(PAS), fast green, or Kodak 'Stains-all' may detect different
patterns. Silver staining is generally used when detection of very
faint proteins is necessary.



5.3.3. Silver Staining SDS PAGE Gels

1. Make 7 % acetic acid: 186 ml of water + 14 ml of acetic acid.

2. Make 50 % methanol: 200 ml of water + 200 ml of methanol.
Optional: for extra fixation/cross-linking add 240 µl of 50 %
glutaraldehyde to the 50 % methanol (makes the solution 0.03 %
glutaraldehyde).

3. Soak gel in 7 % acetic acid for 7 min.

4. Soak gel in 200 ml of 50 % methanol for 20 min, two times.

5. Prepare Solution A: 0.8 g of silver nitrate + 4 ml of water.

6. Rinse gel in ~200 ml water for 10 min.

Note: Steps 7 and 8 are very important for the NuPAGE gels;
if you skip these steps or do not rinse the gel for long enough,
the gel will develop too quickly and have significantly more
background.

7. 5 min before the end of the final water rinse, prepare Solution B:
21 ml of water + 250 µl of 30 % NaOH (to make 0.36 %) + 1.4 ml
of 14.8 M ammonium hydroxide.

8. To make staining solution: add Solution A to Solution B
dropwise while stirring, then add 76 ml of water.

9. Soak gel in the staining solution for 15 min.

10. Rinse gel in ~200 ml water for 5 min.

11. Rinse gel in ~200 ml water for 5 min.

12. Make developing solution: 200 ml of water + 1 ml of 1 %
citric acid + 100 µl of 37 % formaldehyde.

13. Soak gel in developing solution until bands are visible, usually
2-15 min.

14. Stop development by rinsing gel with 3 changes of ~200 ml
water.

15. The sensitivity of this method should be in the 10 ng/band
range.

Chapter 6



Molecular Identification of Microbes: 

I. Macrophomina phaseolina 

Bandamaravuri Kishore Babu, T. Kiran Babu, and Rajan Sharma

Abstract 

This chapter describes the isolation of Macrophomina phaseolina from soil and infected plants and the
examination of its morphological and physiological features for identification using microscopic and
cultural characters. The later part covers recent research findings on identifying this fungus using
PCR-based molecular techniques.



6.1 Introduction 



Macrophomina phaseolina (Tassi) Goid., a soil-borne fungus,
causes charcoal rot [1]. The fungus can infect the root and lower
stem of over 500 plant species and is widely distributed all over the
world. The pathogen causes a wide range of diseases in the arid and
semi-arid regions of the world. M. phaseolina persists in soil as
sclerotia, which form in infected host tissue and are later released
into the soil as the tissue decays. As a root inhabitant, the fungus is
widespread in warmer areas and invades immature, damaged, or
senescent tissues; plants are generally attacked at the seedling and
flowering stages, when conditions are hot and dry. Infection develops
from sclerotia, which can survive for a few years in roots.
M. phaseolina is widely distributed among areas with variable soil
types and annual rainfall, indicating that this fungus can persist
under highly diverse environmental conditions. In its soil-borne
phase, the pathogen remains either on dead organic debris or on the
root stubble left over after crop harvest. High soil
temperature (40 °C), low soil pH (5.4-6), low soil moisture
(8-16 %), and high humidity (90 %) favor infection and disease
development. Long periods of drought and hot temperatures
interspersed with rain showers create ideal conditions for
fungal pathogenesis.

Table 6.1
List of PCR primers

| Primer name | Sequence | PCR product size (bp) |
|---|---|---|
| Universal primers | | |
| ITS-1 | 5' TCCGTAGGTGAACCTGCGG 3' | ~650 |
| ITS-4 | 5' TCCTCCGCTTATTGATATGC 3' | |
| Specific primers | | |
| MPKF1 | 5' CCGCCAGAGGACTATCAAAC 3' | ~350 |
| MPKR1 | 5' CGTCCGAAGCGAGGTGTATT 3' | |
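As a quick sanity check on the primers in Table 6.1, their length, GC content, and a rough melting temperature can be computed. The sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a crude estimate for oligonucleotides of this length and is introduced here as an assumption; it is not the method used to choose the annealing temperatures in this chapter.

```python
# Primer sequences transcribed from Table 6.1 (5' to 3').
PRIMERS = {
    "ITS-1": "TCCGTAGGTGAACCTGCGG",
    "ITS-4": "TCCTCCGCTTATTGATATGC",
    "MPKF1": "CCGCCAGAGGACTATCAAAC",
    "MPKR1": "CGTCCGAAGCGAGGTGTATT",
}


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in °C by the Wallace rule: 2(A+T) + 4(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt", round(gc_percent(seq), 1), "% GC",
              wallace_tm(seq), "°C (Wallace estimate)")
```

More careful estimates (nearest-neighbor models with salt correction) give different numbers, so the output is best read as a consistency check between the two primers of a pair rather than as a protocol parameter.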





6.2 Materials 



• Diseased field soil

• Mesh: 2 mm, 45 µm

• Sodium hypochlorite

• Acidified Potato Dextrose agar plates (pH 5.6)

• M. phaseolina culture

• Potato Dextrose broth

• Lysis buffer (50 mM Tris-HCl, pH 7.8, 50 mM Na2-EDTA,
3 % SDS; 1 % 2-mercaptoethanol should be added
freshly)

• Primers (Table 6.1)

• Thermal cycler

• Taq DNA polymerase

• 10× PCR buffer

• Milli Q water

• 50× TAE buffer



6.3 Methods

6.3.1. Isolation

1. Sieve the air-dried soil through mesh.

2. Suspend 5 g of soil in 0.525 % sodium hypochlorite and allow
to stand for 10-20 min.

3. Wash the deposit in sterile distilled water over a sieve with a
45 µm mesh.

4. Introduce the deposit into a 250-ml flask and incorporate
into 100 ml of PDA.

5. Pour into Petri plates and incubate at 32-34 °C for 3-4
days.

6. Colony morphology: On PDA, colonies range in color from
white to brown or gray and darken with age.

6.3.2. Microscopy

1. Hyphal branches generally form at right angles to parent
hyphae, but branching is also common at acute angles.

2. Pycnidia: 100-200 µm in diameter; dark to grayish, becoming
black with age; globose or flattened globose; membranous to
subcarbonaceous with an inconspicuous or definite truncate
ostiole.

3. The pycnidia bear simple, rod-shaped conidiophores,
10-15 µm long.

4. Conidia: 14-33 × 6-12 µm, single celled, hyaline, and
elliptic or oval.

5. Microsclerotia: jet black in color; they appear smooth and
round to oblong or irregular.

6.3.3. DNA Extraction

See Chap. 1 (Fungal DNA isolation).

6.3.4. Polymerase Chain Reaction

Polymerase chain reaction (PCR) is a technique used to amplify a
part of DNA that lies between two regions of known sequence.
Two oligonucleotides are used as primers for a series of synthetic
reactions that are catalyzed by DNA polymerase. These
oligonucleotides specifically anneal to the target sequences on opposite
strands and flank the region that is to be amplified.

6.3.4.1. Amplification of 5.8S rDNA Gene and ITS Regions

Prepare the PCR master mix following Table 6.2. Distribute
44 µl of master mix into each PCR tube and add 4-6 µl genomic DNA.

6.3.4.2. PCR Program for ITS Primers

Lid heating: Enabled
Step 1 = 95 °C for 5 min (Initial denaturation)
Step 2 = 95 °C for 1 min (Denaturation)
Step 3 = 50 °C for 30 s (Primer annealing)
Step 4 = 72 °C for 1 min 20 s (Elongation)
Step 5 = 34 cycles (Repeat steps 2-4)
Step 6 = 72 °C for 10 min (Final elongation)


Hold at 4 °C for 10 min

Table 6.2
Preparation of PCR master mixture for single reaction

| Reaction mixture | ITS1 and ITS4 | MPKF1 and MPKR1 |
|---|---|---|
| Genomic DNA | 20-40 ng | 10-25 ng |
| Forward primer | 50 pmol | 5 pmol |
| Reverse primer | 50 pmol | 5 pmol |
| dNTPs mix | 0.2 mM | 0.2 mM |
| 10× PCR buffer | 5 µl | 2 µl |
| Taq DNA polymerase | 1 U | 0.4 U |
| Milli Q water | Make up to 50 µl | Make up to 20 µl |

The two right-hand columns give the primer sets and reagent concentrations.
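Table 6.2 describes a single reaction; in practice the mix is scaled for several tubes plus a small excess. The sketch below scales the ITS1/ITS4 column under loudly stated assumptions: the stock concentrations (10 pmol/µl primers, 10 mM dNTP mix, 5 U/µl Taq) and the 10 % overage are hypothetical values chosen for illustration, and the mix is built up to the 44 µl per tube mentioned in the text so that template DNA can be added separately.

```python
# Hypothetical stock concentrations (assumptions, not from the protocol).
PRIMER_STOCK_PMOL_PER_UL = 10.0
DNTP_STOCK_MM = 10.0
TAQ_STOCK_U_PER_UL = 5.0


def its_master_mix(n_reactions, overage=0.1, mix_per_tube_ul=44.0):
    """Volumes (µl) of each component for n ITS1/ITS4 reactions.

    Per-reaction amounts follow Table 6.2 (50 pmol of each primer,
    0.2 mM dNTPs in 50 µl, 5 µl of 10x buffer, 1 U Taq); water fills
    the mix to the volume dispensed per tube before DNA is added.
    """
    per_reaction = {
        "10x PCR buffer": 5.0,
        "forward primer": 50.0 / PRIMER_STOCK_PMOL_PER_UL,
        "reverse primer": 50.0 / PRIMER_STOCK_PMOL_PER_UL,
        "dNTP mix": 0.2 * 50.0 / DNTP_STOCK_MM,
        "Taq polymerase": 1.0 / TAQ_STOCK_U_PER_UL,
    }
    per_reaction["water"] = mix_per_tube_ul - sum(per_reaction.values())
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in per_reaction.items()}


if __name__ == "__main__":
    for component, volume in its_master_mix(10).items():
        print(f"{component}: {volume} µl")
```

With different stock concentrations only the three constants at the top need to change; the water volume adjusts automatically.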




Fig. 6.1 PCR amplification of rDNA gene cluster: (a) Primers ITS1 and ITS4 were used for amplification of a nearly 650 bp
fragment. Lanes 1-10 show amplified products of M. phaseolina isolates. (b) Amplification of M. phaseolina with
specific primers (MpKF1 and MpKR1) produced a 350 bp amplicon in lanes c1 and c2. Lanes 1-12 show no amplified
product with different test microbes. M: 1 kb molecular ladder.



6.3.4.3. PCR Program for M. phaseolina Specific Primers [1]

Lid heating: Enabled
Step 1 = 95 °C for 5 min (Initial denaturation)
Step 2 = 95 °C for 30 s (Denaturation)
Step 3 = 56 °C for 1 min (Primer annealing)
Step 4 = 72 °C for 2 min (Elongation)
Step 5 = 25 cycles (Repeat steps 2-4)
Step 6 = 72 °C for 10 min (Final elongation)
Hold at 4 °C for 10 min
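The two thermal cycler programs can be written down as data and used to estimate how long a run takes. This is a sketch under stated assumptions: each cycle is taken to consist of the denaturation, annealing, and elongation steps, the cycle counts (34 and 25) are read as total cycles, and ramping time and the final 4 °C hold are ignored. The dictionary layout and function name are illustrative.

```python
# Step durations in seconds, transcribed from the two programs in the text.
ITS_PROGRAM = {
    "initial_denaturation_s": 5 * 60,
    "cycle_steps_s": [60, 30, 80],    # 95 °C, 50 °C, 72 °C
    "cycles": 34,
    "final_elongation_s": 10 * 60,
}

SPECIFIC_PROGRAM = {
    "initial_denaturation_s": 5 * 60,
    "cycle_steps_s": [30, 60, 120],   # 95 °C, 56 °C, 72 °C
    "cycles": 25,
    "final_elongation_s": 10 * 60,
}


def programmed_time_min(program):
    """Total programmed time in minutes, excluding ramps and the 4 °C hold."""
    cycling_s = program["cycles"] * sum(program["cycle_steps_s"])
    total_s = (program["initial_denaturation_s"]
               + cycling_s
               + program["final_elongation_s"])
    return total_s / 60.0


if __name__ == "__main__":
    print(round(programmed_time_min(ITS_PROGRAM), 1), "min (ITS primers)")
    print(programmed_time_min(SPECIFIC_PROGRAM), "min (specific primers)")
```

Actual run times are longer because the block needs time to change temperature between steps; the amount depends on the instrument.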
6.3.5. Gel Electrophoresis

PCR amplified products together with marker (1 kb, Fermentas,
USA) were resolved by gel electrophoresis (4 V cm⁻¹) on 1.4 %
agarose gels in 1× TAE buffer containing 0.5 µg ml⁻¹ EtBr and
visualized under a UV transilluminator (Fig. 6.1).
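Band sizes are usually read off a gel by comparison with the ladder. One common approximation, assumed here and valid only over the middle of the ladder's range, is that migration distance is roughly linear in the logarithm of fragment length. The sketch below fits that line by least squares; the ladder distances in the example are made-up numbers for illustration, not measurements from Fig. 6.1.

```python
import math


def fit_ladder(distances_mm, sizes_bp):
    """Least-squares fit of log10(size) against migration distance.

    Returns (slope, intercept) of log10(bp) = slope * distance + intercept.
    """
    n = len(distances_mm)
    logs = [math.log10(size) for size in sizes_bp]
    mean_x = sum(distances_mm) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_mm, logs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_size_bp(distance_mm, slope, intercept):
    """Estimated fragment length for a band at the given distance."""
    return 10 ** (slope * distance_mm + intercept)


if __name__ == "__main__":
    # Hypothetical ladder readings: (distance from well in mm, size in bp).
    ladder_mm = [12.0, 18.0, 25.0, 33.0, 40.0]
    ladder_bp = [3000, 1500, 1000, 500, 250]
    slope, intercept = fit_ladder(ladder_mm, ladder_bp)
    print(round(estimate_size_bp(29.0, slope, intercept)), "bp (estimate)")
```

Bands that run outside the ladder's range should not be sized this way, since the relationship bends at both extremes.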
Molecular Identification of Microbes: II. Bacillus

Anil Kumar Saxena



Abstract

Comparison of 16S rDNA sequences from the type strains of 69 species revealed that the 5' end region was
the hypervariant region (HV region) and highly specific for each type strain. Further, the HV region is
highly conserved within each species. Thus the sequencing of the HV region is a very efficient index for the
rapid identification or grouping of Bacillus species. A simple procedure for identification of the genus Bacillus
per se and for identifying species based on sequencing of only a small fragment of 16S rRNA was performed.
The comparative analysis of restriction sites for AluI in the 16S rRNA gene sequence could identify a 265 bp
band common to at least 25 species but absent in newly created Bacillus-related genera and phenotypically
related genera. Primers were designed to amplify the hypervariant region in the 265 bp fragment, and the
sequencing of this small region was found to be a useful criterion to classify species of Bacillus.
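The AluI analysis mentioned in the abstract can be mimicked in silico. AluI recognizes AGCT and cuts between G and C, leaving blunt ends; since the site is palindromic, scanning one strand is enough. The sketch below treats the input as a linear molecule and uses a short made-up sequence purely for illustration; it is not a 16S rRNA gene sequence and does not reproduce the 265 bp band.

```python
ALUI_SITE = "AGCT"
ALUI_CUT_OFFSET = 2  # blunt cut between G and C: AG^CT


def alui_fragments(seq):
    """Fragment lengths (bp) after complete AluI digestion of a linear sequence."""
    seq = seq.upper()
    cuts = []
    start = seq.find(ALUI_SITE)
    while start != -1:
        cuts.append(start + ALUI_CUT_OFFSET)
        start = seq.find(ALUI_SITE, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Made-up 18 nt sequence with two AluI sites.
    print(alui_fragments("AAAAGCTTTTTTAGCTCC"))  # [5, 9, 4]
```

Applied to real 16S rRNA gene sequences, the list of fragment lengths can be compared across species to look for a shared band of a given size.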



7.1 Introduction 



The genus Bacillus is a large, heterogeneous group of Gram- 
positive, aerobic, endospore-forming, rod-shaped bacteria. Since 
endospore formation is a universal feature of these bacteria, spore 
morphology has traditionally been given considerable weightage 
in their classification and identification. The genus Bacillus was 
established by Cohn [ 1 ] . Since then it has undergone considerable 
taxonomic changes. It started with two prominent and truly 
endospore-forming species. Bacillus anthracis 20 \d Bacillus subtilis 
and their number increased to an incredible 146 in the fifth 
edition of Bergey’s Manual of Determinative Bacteriology. In 
the sixth edition it reduced to 33, in the seventh to 25, and in 
the eighth edition to 22 well-defined species and 26 species that 
received less recognition. 

With the introduction of modern taxonomic techniques such
as numerical phenetics, DNA base composition determinations,
and DNA reassociation experiments, which allow DNA sequence
homology between strains to be estimated, it became apparent that the
bacilli were more heterogeneous than hitherto suspected. The
range of DNA base composition among strains is a good indicator
of genetic diversity; it is generally agreed that species in a genus
should not vary by more than 10-12 mol% G+C. In the case of
Bacillus, the range is about 33-65 %, although strains of most
species cluster between 40 % and 50 % [2]. Different studies
have grouped the species into various clusters depending upon
the characters used [3-5]. Studies based on comparative analysis
of 16S rDNA gene sequences of different Bacillus species revealed
five phylogenetically distinct clusters [3]. Further characterization
at the genotypic and phenotypic levels of selected Bacillus species
has led to the creation of several new genera like Alicyclobacillus,
Paenibacillus, Brevibacillus, Virgibacillus, Geobacillus, Filobacillus,
Jeotgalibacillus, Aneurinibacillus, Gracilibacillus, and Marinibacillus.
The Bacillus species along with the species from related
genera were identified successfully and differentiated by rRNA
gene restriction patterns, and three distinct main genetic clusters
at the 75 % banding pattern similarity were obtained [6]. In
general, 16S rDNA sequences are used in Bacillus classification as
a framework for species delineation [7]. Today over 200 species of
aerobic, endospore-forming bacteria (AEFB) allocated to about
25 genera have been validly published.
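
The mol% G+C criterion mentioned above can be illustrated with a short calculation. This is only a minimal sketch: base composition in taxonomy is determined for whole genomes, whereas the function below simply counts bases in whatever sequence string it is given, and the function names and toy sequences are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Return the mol% G+C of a DNA sequence, ignoring symbols other than A, C, G, T."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def gc_range(sequences: dict[str, str]) -> float:
    """Spread (max minus min) of mol% G+C across a set of named sequences."""
    values = [gc_content(seq) for seq in sequences.values()]
    return max(values) - min(values)


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not real Bacillus data.
    toy = {"strain_1": "ATATGCAT", "strain_2": "GGCCATGC"}
    for name, seq in toy.items():
        print(name, round(gc_content(seq), 1), "mol% G+C")
    print("range:", round(gc_range(toy), 1), "mol%")
```

A spread well above the 10-12 mol% guideline, as reported for Bacillus, is the kind of signal that prompted the splitting of the genus.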



7.2 Methods for the Identification of Bacillus and Bacillus-Derived Genera



7.2.1. Phylogenetic Relationship Between Bacillus Species and Related Genera Through Comparison of 3' End of the 16S rDNA and 3' End of the 16S-23S ITS Nucleotide Sequences



Ribosomal RNA sequences have been established as the most
useful molecular chronometer to infer phylogenetic relationships
because they are present in all organisms and changes in the
nucleotide sequences are deemed to occur in a clocklike manner
[8]. A comparison of 16S and 16S-23S ITS nucleotide sequences
led to the identification of conserved regions in almost 46 different
Bacillus species. A primer pair was developed from the conserved
region; one located about 200 nt upstream from the 3' end
of the 16S rRNA gene, the other about 80 nt downstream from
the 5' end of the 23S rRNA gene. It could amplify the last 200 bp
of the 16S rRNA gene and the entire 16S-23S ITS region. The
amplified fragments vary in length from 450 to 850 bp [5].
Sequencing of the amplified product could be used to establish
the phylogenetic relationship among the Bacillus species.
The protocol is as follows:

1. Grow the Bacillus strains in nutrient broth at 30 °C for
24-48 h to an OD of about 0.6.

2. Isolate the genomic DNA from the culture following the
protocol described in this book for Gram-positive bacteria.

3. Amplify the 3' end of 16S rDNA, the 16S-23S ITS region,
and the 5' end of 23S rDNA with a pair of primers: L516SF
(5'-TCGCTAGTAATCGCGGATCAGC-3') and L523SR
(5'-GCATATCGGTGTTAGTCCCGTCC-3').

4. Perform amplification in a total volume of 50 µl containing
about 50 ng DNA, 0.25 µM each primer, 200 µM dNTP,
1.5 mM MgCl2, and 1.25 U of Taq DNA polymerase.

5. Perform PCR under the following conditions: 45 s at 95 °C and
then 30 cycles of 15 s at 94 °C, 30 s at 53 °C, and 90 s at 72 °C.

6. View the amplification products on 0.8 % agarose gels.

7. Clone the amplified DNAs into a pCRII-TOPO cloning
vector using the TOPO TA cloning kit, following the
manufacturer's instructions.

8. Sequence the cloned fragments using an automated DNA
sequencer.

9. Align the sequences using the CLUSTAL W program [9] and
construct the most parsimonious phylogenetic trees using the
DNAPARS program of the PHYLIP package, version 3.6a2
[10, 11].
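
The reaction in step 4 can be set up from stock solutions with the usual C1·V1 = C2·V2 relation. The sketch below reads the final concentrations as 0.25 µM primer, 200 µM dNTP, and 1.5 mM MgCl2 in 50 µl; the stock concentrations (10 µM primers, 10 mM each dNTP, 25 mM MgCl2) are assumptions chosen for illustration and are not given in the protocol.

```python
def stock_volume_ul(final_conc: float, stock_conc: float, total_ul: float) -> float:
    """Volume of stock to pipette, from C1*V1 = C2*V2 (both concentrations in the same unit)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * total_ul / stock_conc


TOTAL_UL = 50.0

# name: (final concentration, assumed stock concentration), both in µM
COMPONENTS = {
    "forward primer": (0.25, 10.0),
    "reverse primer": (0.25, 10.0),
    "dNTP (each)": (200.0, 10_000.0),
    "MgCl2": (1_500.0, 25_000.0),
}

VOLUMES = {
    name: stock_volume_ul(final, stock, TOTAL_UL)
    for name, (final, stock) in COMPONENTS.items()
}

if __name__ == "__main__":
    for name, volume in VOLUMES.items():
        print(f"{name}: {volume:.2f} µl")
```

With different stock concentrations only the second number in each tuple needs to change.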



7.2.2. Partial 16S rDNA Sequence for Rapid Identification of Species in the Genus Bacillus



Identification of Bacillus species based on the partial sequencing
of 16S rDNA was described by Goto et al. [12]. Comparison of
16S rDNA sequences from the type strains of 69 species revealed
that the 5' end region was the hypervariant region (HV region)
and highly specific for each type strain. Further, the HV region is
highly conserved within each species. Thus the sequencing of the HV
region is a very efficient index for the rapid identification or
grouping of Bacillus species.

The protocol is as follows: 



1. Grow the Bacillus strains in nutrient broth at 30 °C for
24-48 h to an OD of about 0.6.

2. Isolate the genomic DNA from the culture following the
protocol described in this manual for Gram-positive bacteria.

3. Amplify the HV region of 16S rDNA with a pair of primers: a forward
primer: 5'-TGT AAA ACC ACG GCC AGTGCC TAA TAG
ATG CAA GTC GAG CG-3' and a reverse primer: 5'-CAG
GAA ACA GCT ATG ACC ACT GCT GCC TCCCGT AGG
AGT-3'.

4. Perform amplification in a total volume of 50 µl containing
about 50 ng DNA, 100 ng of each primer, 200 µM dNTP,
1.5 mM MgCl2, and 1.25 U DNA polymerase.

5. Perform PCR under the following conditions: 5 min at 95 °C
and then 30 cycles of 45 s at 94 °C, 30 s at 53 °C, and 90 s at
72 °C.

6. View the amplification products on 0.8 % agarose gels.

7. Perform the sequence analysis using the EMBL, GenBank,
and DDBJ databases and carry out a BLAST search to look for
the nearest neighbor.
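
The idea behind the nearest-neighbor search in step 7 can be shown with a toy comparison of pre-aligned HV-region sequences. This is not BLAST and is not a substitute for it: it is a minimal sketch that assumes the sequences are already aligned to equal length (gaps written as "-"), and the reference names and sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical positions between two aligned sequences, skipping gapped columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def nearest_neighbor(query: str, references: dict[str, str]) -> tuple[str, float]:
    """Return the reference name with the highest identity to the query, and that identity."""
    scored = {name: percent_identity(query, seq) for name, seq in references.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


if __name__ == "__main__":
    # Hypothetical aligned HV-region fragments, for illustration only.
    refs = {"type_strain_A": "ACGTTGCA", "type_strain_B": "ACGATGGA"}
    print(nearest_neighbor("ACGTTGCA", refs))
```

In practice the database search also handles alignment, scoring, and statistical significance, which this sketch leaves out.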



7.2.3. ARDRA and Partial Sequencing of rRNA as an Index to Identify Bacillus Species



Several new genera like Alicyclobacillus, Paenibacillus, Brevibacillus,
Virgibacillus, Geobacillus, Filobacillus, Jeotgalibacillus,
Aneurinibacillus, Gracilibacillus, and Marinibacillus have been derived
from the genus Bacillus. A simple procedure was developed to identify
the genus Bacillus per se and to identify species based on sequencing
of only a small fragment of 16S rRNA. Comparative analysis of
restriction sites for AluI in the 16S rRNA gene sequence identified
a 265 bp band common to at least 25 species but absent in newly
created Bacillus-related genera and phenotypically related genera.
Primers were designed to amplify the hypervariant region in the
265 bp fragment, and sequencing of this small region was found to
be a useful criterion to classify species of Bacillus.

The protocol is as follows: 

1. Grow the Bacillus strains in nutrient broth at 30 °C for
24-48 h to an OD of about 0.6.

2. Isolate the genomic DNA from the culture following the
protocol described in this book for Gram-positive bacteria.

3. Amplify 16S rDNA with 60-100 ng of pure genomic DNA
using the forward (PA) 5'-(AGAGTTTGATCCTGGCTCAG)-3'
and reverse (PH) 5'-(AAGGAGGTGATCCAGCCGCA)-3' primers.

4. Perform the amplification reaction in a 100 µl volume by mixing
template DNA with the polymerase reaction buffer (1x);
80 µM (each) dATP, dCTP, dTTP, and dGTP; primers PA
and PH (100 ng each); and 1.5 U polymerase.

5. Perform PCR amplification with the following temperature
profile: an initial denaturation at 94 °C for 5 min; 30
cycles of denaturation for 30 s at 94 °C, annealing at 52 °C for
40 s, and extension at 72 °C for 1 min 30 s; and a final
extension at 72 °C for 7 min.

Fig. 7.1 Amplification of 220 bp fragment of 16S rDNA using the primer pair 265F1 and 265R1.

6. Run the amplified product on a 0.8 % agarose gel along with λ
HindIII molecular weight marker at a constant voltage and
visualize the gel, after staining with ethidium bromide, in a
gel documentation system.

7. Digest an aliquot of purified 16S rDNA with restriction
endonuclease AluI in a 25 µl reaction volume using the
manufacturer's recommended buffer and temperature.

8. Run the restriction products in 2.5 % (w/v) agarose gel in TE
buffer for 3 h. Stain the gel and view as described above.

9. Look for the presence of the 265 bp fragment. If present, the
species belongs to Bacillus, with a few exceptions like B. cereus,
B. thuringiensis, B. anthracis, and B. mycoides. If the fragment
is absent, the isolate may belong to a Bacillus-derived genus.

10. Perform another PCR amplification reaction to amplify
only the 265 bp fragment using a primer pair: 265F1:
5'-GTGCTACAATGGACAGAACAA-3' and 265R1:
5'-GTGAGATGTTGGGTTAAGTC-3', using the reaction and
amplification conditions described above for amplification of 16S rDNA.
The primers will amplify a product of 220 bp (Fig. 7.1).

11. Gel extract the fragment using commercially available 
gel extraction kit or purify the fragment using PCR purifica- 
tion kit. 

12. Sequence the product and BLAST search for nearest neighbor 
using EMBL, GenBank, and DDBJ database. 
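
The AluI digestion in steps 7-9 can also be previewed in silico when a 16S rDNA sequence is available. The sketch below assumes a linear sequence and uses the known AluI recognition site AGCT, cut between G and C to leave blunt ends; the function names and the size tolerance are hypothetical choices, and real gel estimates of fragment size are only approximate.

```python
ALUI_SITE = "AGCT"
CUT_OFFSET = 2  # AluI cuts AG^CT, i.e. two bases into the site, leaving blunt ends


def alui_fragments(seq: str) -> list[int]:
    """Return fragment lengths (5' to 3') after complete AluI digestion of a linear sequence."""
    seq = seq.upper()
    cuts = []
    start = seq.find(ALUI_SITE)
    while start != -1:
        cuts.append(start + CUT_OFFSET)
        start = seq.find(ALUI_SITE, start + 1)
    edges = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(edges, edges[1:])]


def has_fragment(seq: str, size: int = 265, tolerance: int = 5) -> bool:
    """True if any AluI fragment lies within `tolerance` bp of `size`."""
    return any(abs(length - size) <= tolerance for length in alui_fragments(seq))


if __name__ == "__main__":
    # Toy sequence for illustration only: one AluI site placed so the first fragment is 265 bp.
    toy = "A" * 263 + "AGCT" + "A" * 10
    print(alui_fragments(toy), has_fragment(toy))
```

Running `alui_fragments` on a full-length 16S rDNA sequence gives the band sizes to expect on the 2.5 % gel before committing to the digest.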
Chapter 8



Molecular Identification of Microbes: III. Pseudomonas 

Bhim Pratap Singh and Ratul Saikia



Abstract 

The molecular identification protocol for Pseudomonas introduces PCR cycle sequencing and some
basic bioinformatics tools. First, follow a simple protocol to isolate genomic DNA from the bacteria.
This protocol involves breaking the cells open with a series of freeze/thaw cycles and then centrifuging
to remove cellular debris. Many techniques exist for the identification of Pseudomonas. We present a
detailed explanation of the setup for 16S rRNA amplification and 16S rRNA RFLP. After isolation
of DNA, a PCR reaction has to be set up to amplify a region of the 16S rRNA gene. The PCR product
should be cleaned up using a PCR purification kit, which cleaves excess primers and inactivates free
nucleotides. The cleaned PCR product is then used as the template for a sequencing reaction. Sequencing
should be done using BigDye reagents, and the reactions are run in a thermocycler (PCR machine).
The completed samples are then sent to a core facility to obtain the sequence. Finally, view the
electropherograms from the sequencing reaction, then use the sequence in a BLAST search limited to a bacterial
database. Students can identify their unknown bacteria by examining the top-scoring sequences from the
BLAST search results. 16S rRNA-RFLP is an important tool for understanding the phylogenetic similarity
between isolates, so that one can avoid sequencing isolates that are similar, which
reduces cost.



8.1 Introduction 



Pseudomonas is a member of the Gammaproteobacteria class of
Eubacteria. It is a Gram-negative, free-living aerobic rod belonging
to the family Pseudomonadaceae, commonly found in soil and
water. Since the revisionist taxonomy based on conserved
macromolecules (e.g., 16S ribosomal RNA), the family includes only
members of the genus Pseudomonas, which are divided into eight
groups; the best-studied species include the plant growth promoting
P. fluorescens [1-3], the plant pathogen P. syringae [4], and
P. aeruginosa in its role as an opportunistic human pathogen [5]. Their
ease of culture in vitro and the availability of an increasing number
of Pseudomonas strain genome sequences have made the genus an
excellent focus for scientific research.

Fig. 8.1 Various methods used for the identification of Pseudomonas spp. (Figure labels: phenotypic identification; immunological identification; selective media; microscopic examination; biochemical tests; rapid tests; flow cytometry; bacteriophage typing; precipitation; agglutination; fluorescent antibodies; ELISA; RIA; Western blotting; PCR locus-specific amplification, i.e., 16S rRNA; amplified rDNA restriction analysis (ARDRA).)

There is increasing interest in the use of molecular methods
based on polymerase chain reaction (PCR) technology
for the identification of microbial communities. Such methods offer
the advantage of reducing or eliminating the need for lengthy
culturing and difficult morphological identification procedures
(Fig. 8.1).

PCR is a very simple reaction in both concept and practice.
First, template DNA that contains some region of interest is
isolated. The template does not need to be purified, and one can
start with a very tiny amount of template. Here, our source of
template DNA will be the genome of your unknown bacterial
strain. The experimenter does need to have some prior
sequence information about the segment to be amplified, because
primers (for DNA replication) need to be used that are
complementary to sequence on both sides of the segment to be amplified.
In this experiment, we will be using “universal” 16S rRNA gene
primers. Because we are hoping to amplify genes that are similar
but not identical from the different bacteria, primers need to be
designed to anneal to the parts of the DNA sequence that are most
similar in all bacteria. The actual production of DNA is carried out
by a DNA polymerase enzyme, which starts at the primers and
uses the template to produce copies of the template sequence.
Many copies of the template are created by repeating the reaction
many times (i.e., many cycles).

After creating numerous copies of the 16S rRNA genes from
unknown bacteria, we will determine the sequence of the DNA
using a modified “dideoxy” sequencing reaction. Once the DNA
sequence is determined, we will compare the sequence to known
sequences in a computer database. The 16S rRNA genes of
many bacterial species have already been sequenced, so it is
likely that the sequence of the 16S rRNA gene of your unknown
is in the database. The potential benefits of this technology
have been especially recognized in the regulatory field, where
timeliness and accuracy of identifications are crucial. Pseudomonas
is an important genus with great potential, so molecular
protocols are needed to identify it.



8.2 Materials

8.2.1. Amplification of 16S rRNA Gene

1. UV-VIS spectrophotometer: To measure concentration of
DNA.

2. Universal primers: Universal primers for the amplification of
16S rRNA gene. The primers can also be synthesized
commercially (see Note 1).

3. Polymerase chain reaction (PCR): A thermal cycler to amplify
the specific region of DNA.

4. Gel electrophoresis unit: Required to run the gel.

5. Gel documentation system: To view the gel photos (see Note 2).

6. Sequencing: It can be done commercially by different
institutions.

7. Bioinformatics facility: Basic bioinformatics facility to analyze
the sequenced data for the identification of bacteria.

8.2.2. 16S rRNA-RFLP

1. The materials listed in Sect. 8.2.1.

2. Restriction endonucleases, mainly tetra-cutters like MspI,
HaeIII, AluI, etc., can be procured from any chemical
supplying company.

3. NTsys 2.02 software to analyze the phylogenetic relationship
among the isolates.
8.3 Methods

8.3.1. Polymerase Chain Reaction Mediated Locus-Specific Amplification from the Genomic DNA of Pseudomonas

Microbial community analysis based on small-subunit rRNA
genes as well as protein-coding gene clone libraries has become
common practice in microbial ecology. The PCR has enabled
specific genetic loci to be routinely amplified and examined for
differences indicative of strain variation. The specific locus is
amplified with gene-specific primers and subjected
to RFLP analysis. The DNA fragments are separated on an agarose
gel, and the digested patterns are visualized following ethidium
bromide staining.

Analysis of the 16S rRNA gene sequence is of fundamental
importance to current prokaryote biodiversity studies and
phylogenetic analyses. Most 16S rRNA gene sequences are generated
through PCR amplification of mixed template samples, and as a
consequence an increasing number of chimeric 16S rRNA
sequence records are being deposited into the public repositories
[6]. Locus-specific RFLP has been applied in a number of
situations. The 16S, 23S, and 16S-23S spacer regions have been used
as targets for locus-specific RFLP. In this variation of ribotyping,
the ribosomal DNA is amplified and subjected to digestion with
a restriction enzyme, and the DNA fragments are visualized
following separation by gel electrophoresis. The experimental lay-out is
described in Figs. 8.2 and 8.3.

8.3.2. Amplification of 16S rRNA

1. The genomic DNA isolated from unknown bacteria is
quantified spectrophotometrically by measuring the
absorbance at 260 nm. The amount of DNA is estimated
using the relationship that an OD of 1.0 corresponds to
50 µg/ml. Part of each DNA sample should be diluted with an
appropriate amount of Milli-Q water to yield a working
concentration of 50 ng/µl and stored at 4 °C.

2. Primers: Universal primers can be used for 16S rRNA
amplification. The stock solution (100 ng/µl) of the primers needs
to be prepared by reconstituting lyophilized primers in
Milli-Q water and stored at -20 °C.

3. Amplification reactions can be performed in a 100 µl volume
containing 1x PCR buffer (4 µl), 2 mM dNTPs (4 µl),
100 ng of each primer (1.5 µl), 50 ng of template DNA
(3 µl), and 1 unit Taq DNA polymerase (0.60 µl), with the volume
adjusted with Milli-Q water. The PCR protocol is as
follows: initial denaturation at 94 °C for 5 min, 40 cycles of
denaturation at 94 °C for 30 s, annealing at 55 °C for 40 s,
and extension at 72 °C for 1 min, and final extension at 72 °C
for 10 min (see Note 3).

Fig. 8.2 Steps involved in the identification of Pseudomonas by amplification of 16S rRNA gene. (Legible figure steps: BLAST the sequences in the public database; name the unknown organism from the results of the BLAST search.)



4. Tris-Acetate-EDTA (TAE) buffer, 50x stock solution, for
running gels.

• 242 g Tris base.

• 57.1 ml glacial acetic acid.

• 100 ml 0.5 M EDTA (pH 8.0).

Make up the volume to 1 l by adding ddH2O. Autoclave
and use at 1x in the running tray and for gel preparation
(see Note 4).

5. Ethidium bromide: Stock should be prepared by dissolving
10 mg in 1 ml of ddH2O (see Note 5).

6. 6x gel loading dye: 0.25 % bromophenol blue; 0.25 %
xylene cyanol; 30 % glycerol; 60 mM EDTA. Make up the
volume to 20 ml with Millipore water.
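
The quantification and dilution in step 1 of this section reduce to two small formulas. The sketch below uses the relationship stated there (an OD of 1.0 at 260 nm corresponds to 50 µg/ml, which equals 50 ng/µl); the function names, the optional cuvette dilution factor, and the example readings are hypothetical.

```python
DSDNA_NG_PER_UL_PER_A260 = 50.0  # OD260 of 1.0 ~ 50 µg/ml = 50 ng/µl, as stated in step 1


def dna_conc_ng_per_ul(a260: float, dilution_factor: float = 1.0) -> float:
    """DNA concentration of the undiluted sample from an A260 reading."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def dilution_volumes(stock_ng_per_ul: float, target_ng_per_ul: float,
                     final_ul: float) -> tuple[float, float]:
    """Return (µl of DNA stock, µl of water) needed to reach the target concentration."""
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("cannot dilute to a concentration above the stock")
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.2 measured on a 1:50 dilution of the sample.
    stock = dna_conc_ng_per_ul(0.2, dilution_factor=50)
    dna_ul, water_ul = dilution_volumes(stock, target_ng_per_ul=50.0, final_ul=100.0)
    print(f"stock {stock:.0f} ng/µl: mix {dna_ul:.1f} µl DNA with {water_ul:.1f} µl water")
```

The same `dilution_volumes` helper applies to preparing 1x TAE from the 50x stock if the concentrations are expressed as fold values.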



8.3.3. 16S rRNA-RFLP

1. Amplify the 16S rRNA gene using the universal primers
as mentioned above, along with a control (some known
Pseudomonas spp.), following Fig. 8.4.

2. Select any one tetra-cutter restriction enzyme (AluI, HaeIII,
or MspI).

Fig. 8.3 General setup to perform the amplification of 16S rRNA gene in detail with experimental lay-out for the
identification of Pseudomonas. (Legible figure steps: isolate intact and pure genomic DNA; check the concentration of DNA, either directly by running on an agarose gel and comparing with standard DNA of known molecular weight, or by spectrophotometer using the 260/280 ratio; dilute the DNA to the working concentration of 20-50 ng/µL; assemble the sequences by deleting the repeated sequence using a bioinformatics tool like CAP3; BLAST the complete sequences in the public database and see the e-value; 100 % e-value indicates the name of the particular organism.)



3. Perform restriction digestion with any one enzyme: mix
100 ng of amplified DNA (the PCR product), 10x
enzyme-specific buffer, and 1 U of enzyme in an Eppendorf tube and
incubate at 37 °C for 1 h.

4. Run the digested product on a 3 % agarose gel.

5. Record the banding pattern for all the isolates.

6. Construct the dendrogram using NTsys 2.02 or any other
available online or offline software (see Note 6).

7. Analyze the position of unknown isolates with respect to the
control (known Pseudomonas).

8. Send one or two representative isolates from each cluster
for sequencing (see Note 7).


Fig. 8.4 Experimental lay-out to perform 16S rRNA RFLP.



8.4 Notes



1. The universal 16S rRNA primers (F-AGA GTT TGA TCC
TGG CTC AG and R-ACG GCT ACC TTC TTA CGA CTT)
can be used for PCR amplification of 16S rDNA [5]. These
primers can be synthesized or procured commercially.
Primers should be diluted to the working concentration of
50-100 pmol/µl.

2. Visualize the gel carefully and score “1” for the
presence of a band and “0” for its absence, comparing against
the known molecular weight marker.

3. If problems persist with PCR conditions, they can be
checked and rectified through http://www.pcronline.com/.

4. TAE buffer should be diluted to 1x from the 50x stock. Autoclave
the buffer before use. Always prepare the gel in 1x TAE buffer,
not ddH2O. Prepare the EDTA separately at pH 8.0 and
add it to the buffer.

5. Ethidium bromide is a carcinogenic chemical; always wear
gloves when handling EtBr.

6. For dendrogram construction you can use many online
(Dendrogram plot, hierarchical clustering, Phylip, etc.) as
well as offline (BioEdit, NTSys, DNA star, etc.) software packages.

7. Sequencing can be done commercially by different
institutions.



Chapter 9



Molecular Identification of Microbes: IV. Vibrio 

Hirak Ranjan Dash, Neelam Mangwani, and Surajit Das



Abstract 

Vibrios are ubiquitous and abundant in aquatic environments. A huge fraction of vibrios have also been
detected in marine environments. Although of environmental origin, Vibrio species can cause diseases such as
cholera and gastroenteritis. Detection of the species therefore becomes of utmost importance when
disease diagnosis and identification of the causative agent are needed. Though many manual techniques are available
for the identification of Vibrio species, they are time-consuming and not economical.
Hence in this chapter, we describe techniques for identification of Vibrio species at the molecular
level by housekeeping gene amplification and PCR fingerprinting, which are economical as
well as time saving and allow rapid identification to speed up diagnosis.



9.1 Introduction 



Vibrio spp. are curved, rod-shaped, Gram-negative bacteria found
in brackish salt water, being part of the normal microflora of the marine
environment. When ingested they cause gastrointestinal illness in
humans and are reported to be the most serious pathogens of fish
and shellfish in marine aquaculture worldwide. They are
characteristically oxidase positive, facultatively aerobic, and
non-spore forming. They are motile, possessing a
single polar flagellum [1]. Most Vibrio species are pathogenic
in nature; those associated with gastroenteritis and
septicemia include V. cholerae, V. parahaemolyticus, V. vulnificus,
V. fischeri, and V. harveyi [2]. Almost all people, whether healthy
or immunocompromised, are at risk of developing disease from
Vibrio species. Healthy hosts who
ingest large quantities of Vibrio may experience only gastroenteritis,
whereas immunocompromised patients with histories of
liver disease are more vulnerable to septicemia.

Fig. 9.1 Flow chart for identification of Vibrio cholerae and other species of Vibrio. (Legible figure labels: stool sample; alkaline peptone water; TCBS agar; colonies; oxidase (+), string (+); test for V. parahaemolyticus.)

Vibrio infection accounts for 69 % of the total food-borne
outbreaks worldwide and can become more severe if
untreated, but when treated the fatality rate due to Vibrio
infection is reduced to less than 1 %. The treatment for Vibrio
infections includes fluid and electrolyte replacement and
antibiotics. For targeted and quick treatment of bacterial
infection, rapid identification of the cause of infection is
required. Though various culture-based identification
procedures are currently available for Vibrio species, they are time
consuming and not very accurate. Molecular procedures for
identification of Vibrio species are rapid and accurate, and
are therefore more efficient and feasible.
Currently molecular methods of identification are often used in
addition to or instead of biochemical techniques. Molecular
methods involve examining the DNA of the bacterium, either by using
a technique to map certain important characteristics of an
organism's genome or by sequencing a portion of the organism's DNA.
Results are then compared to a database of known bacteria,
resulting in a match that allows identification. DNA sequencing has
become so standard and straightforward that it is now often easier
and quicker than traditional biochemical methods. The currently
available molecular methods for identification of Vibrio species include
randomly amplified polymorphic DNA, 16S rDNA gene-based
identification, use of oligonucleotide probes, pulse field gel
electrophoresis (PFGE), and multiplex PCR. Several
protocols (Fig. 9.1) are available for the identification of Vibrio
species; some of them are described in this chapter.
9.1.1. Testing of Serology

The primary basis of classification of strains of Vibrio species is a
serotyping scheme, which depends on the antigenic properties of
the somatic (O) and capsular (K) antigens. The serotyping scheme
for Vibrio is a combination of O and K antigens, and serotyping is
done using commercially available antisera that include 11 different
O antigens and 71 different K types. The serotyping scheme
was developed using strains of clinical origin. In addition to
serotyping, phage typing can be conducted using 46 phages
belonging to morphological groups II, IV, and V.

9.1.2. Multilocus Sequence Analysis

Multilocus sequence analysis is a useful tool for the identification
of Vibrio at the genetic level. In this approach, various housekeeping
genes like rpoA, recA, and pyrH are targeted [3]. After amplification
and sequencing of these genes, the strains can be identified and
their phylogeny can be compared with earlier
polyphasic taxonomic studies such as the 16S rRNA based phylogeny of
Vibrio. The data generated in this way are well suited
for the rapid detection and identification of pathogenic Vibrios in
the environment through real-time PCR. The data could be an
alternative to 16S rRNA gene sequences in studies of the ecology
and community dynamics of Vibrio in environmental as well as
clinical samples. The advantages of studying the loci mentioned
above are that they belong to the bacterial core genome, have a high
phylogenetic signal, and are single-copy genes, and that different
species of Vibrios have different gene sequences, which enables
the reliable identification of these organisms.

9.1.3. Pulse Field Gel Electrophoresis

PFGE is a useful technique for the molecular identification of Vibrio
species. PFGE banding patterns distinguished between and identified
diversity within each of the major MEE types [4]. The groupings
obtained by separation of strains by PFGE patterns agree with
the pattern obtained by ribotyping. Although the PFGE patterns of
Vibrio cholerae may be too numerous and analysis of these patterns
may be too complex to be used in a general typing scheme, the
variety that they offer is of particular value in investigations of
epidemics. PFGE appears to be the most discriminating of several
molecular subtyping methods used, and it is reproducible, relatively
stable over time, and relatively rapid in comparison with MEE and
methods requiring DNA hybridization.

9.1.4. Multiplex Polymerase Chain Reaction

Multiplex PCR methods have been developed to characterize
Vibrio species using a single PCR for many characteristics.
The advantage of this PCR assay is that it overcomes the
time-consuming, laborious, and expensive laboratory practices of
biochemical identification of Vibrio and also distinguishes
V. cholerae, Vibrio mimicus, and other Vibrio spp. from Aeromonas,
which have highly similar biochemical properties. A septaplex PCR
protocol has been described here for the rapid identification
of Vibrio spp. along with their virulence genes [5].



9.1.5. Analysis of Full Length toxR Gene



The toxR gene of Vibrio species codes for a transmembrane
DNA-binding protein which activates the transcription of the cholera
toxin operon and of a gene (tcpA) encoding a major subunit of the
pilus colonizing factor [6]. The full-length sequence of the toxR gene
(i.e., 1,000 bp) from type strains of different Vibrios provides an
insight for evaluating phylogenetic relatedness and allows rapid
species-specific identification and differential detection of unknown
Vibrios [7]. Because of the greater sequence variation among
species, the toxR gene is more effective than
the 16S rRNA gene in distinguishing different species of Vibrio.



9.2 Materials



9.2.1. Multilocus Sequence Analysis

1. DNA extraction kit

2. rpoA, recA, and pyrH primers

3. dNTPs

4. Taq DNA polymerase

5. PCR reaction buffer

6. PCR tubes and tips

9.2.2. Pulse Field Gel Electrophoresis

1. Wash buffer (1 M NaCl, 10 mM Tris [pH 8.0], 10 mM
EDTA)

2. Chromosomal grade agarose (Bio-Rad, Richmond, California)

3. Plug mold (Bio-Rad)

4. Lysis buffer (1 M NaCl, 10 mM Tris [pH 8.0], 100 mM
EDTA, 0.5 % Sarkosyl, 0.2 % sodium deoxycholate, 1 mg of
lysozyme per ml, 2 µg of RNase per ml)

5. ESP buffer (0.5 M EDTA, 1 % Sarkosyl, 1 mg of proteinase
K per ml)

6. TE (10 mM Tris [pH 8.0], 1 mM EDTA)

7. 0.1 M phenylmethylsulfonyl fluoride

8. NotI buffer (150 mM NaCl, 10 mM Tris [pH 8.0], 10 mM
MgCl2)

9. NotI (New England Biolabs, Inc., Beverly, Mass.)

10. Fast-lane agarose gel (FMC, Rockland, Maine)

11. 0.5 X TBE (lOx TBE is 89 mM Tris base, 89 mM boric acid, 
and 2.5 mM disodium EDTA) 

12. Bacteriophage lambda DNA ladders (FMC) 

13. Ethidium bromide (2 µg/ml in water) 


Table 9.1 
Different primers used in multiplex PCR 

Target gene          Primer sequence (5'-3')    Amplicon   Gene           Primer site
                                                size (bp)  accession no.
O139 rfb-F           AGCCTCTTTATTACGGGTGG       449        Y07786         12288-12307
O139 rfb-R           GTCAAACCCGATCGTAAAGG       449        Y07786         12717-12736
O1 rfb-F             GTTTCACTGAACAGATGGG        192        X59554         13195-13213
O1 rfb-R             GGTCATCTGTAAGTACAAC        192        X59554         13368-13386
ISR rRNA VC-F        TTAAGCSTTTTCRCTGAGAATG     295        AF114723       227-248
ISR rRNA VC-R        AGTCACTTAACCATACAACCCG     295        AF114723       501-522
ctxA F               CGGGCAGATTCTAGACCTCCTG     564        X00171         588-609
ctxA R               CGATGATCTTGGAGCATTCCCAC    564        X00171         1129-1151
toxR F               CCTTCGATCCCCTAAGCAATAC     779        M21249         277-298
toxR R               AGGGTTAGCAACGATGCGTAAG     779        M21249         1034-1055
tcpA-F Class/El Tor  CACGATAAGAAAACCGGTCAAGAG   620        X64098         3379-3402
tcpA-R Class         TTACCAAATGCAACGCCGAATG     620        X64098         3977-3998
tcpA-R El Tor        AATCATGAGTTCAGCTTCCCGC     823        X74730         3235-3256
Sxt-F                TCGGGTATCGCCCAAGGGCA       946        AF099172       90-109
Sxt-R                GCGAAGATCATGCATAGACC       946        AF099172       1016-1035 


9.2.3. Multiplex Polymerase Chain Reaction 

1. Luria Bertani agar media 

2. PCR buffer 

3. Magnesium chloride 

4. dNTPs 

5. Taq polymerase 

6. Autoclaved Milli-Q water 

7. Primers (Table 9.1) 
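
The amplicon sizes and primer sites listed in Table 9.1 can be cross-checked against each other. The following is a minimal sketch, assuming that a primer site is given as inclusive start and end coordinates on the same accession and that the amplicon runs from the first base of the forward primer to the last base of the reverse primer; the dictionary and function names are illustrative only.

```python
# Primer sites (start, end) and listed amplicon sizes copied from Table 9.1.
# Only primer pairs whose two sites lie on the same accession are included.
PRIMER_PAIRS = {
    "O139 rfb": ((12288, 12307), (12717, 12736), 449),
    "O1 rfb": ((13195, 13213), (13368, 13386), 192),
    "ctxA": ((588, 609), (1129, 1151), 564),
    "toxR": ((277, 298), (1034, 1055), 779),
    "tcpA Class": ((3379, 3402), (3977, 3998), 620),
    "sxt": ((90, 109), (1016, 1035), 946),
}


def amplicon_length(forward_site, reverse_site):
    """Length from the first forward-primer base to the last reverse-primer base."""
    return reverse_site[1] - forward_site[0] + 1


def check_pairs(pairs=PRIMER_PAIRS):
    """Return {target: (computed_bp, listed_bp)} for every primer pair."""
    return {
        target: (amplicon_length(fwd, rev), listed)
        for target, (fwd, rev, listed) in pairs.items()
    }


if __name__ == "__main__":
    for target, (computed, listed) in check_pairs().items():
        print(f"{target}: computed {computed} bp, listed {listed} bp")
```

Under the same assumption, the ISR rRNA pair (227-248 and 501-522) computes to 296 bp, one base more than the 295 bp listed, and the tcpA El Tor reverse primer cannot be checked this way because its site refers to a different accession (X74730) than the shared forward primer.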
9.2.4. Analysis of Full-Length toxR Gene 

1. Genomic DNA extraction kit 

2. PCR buffer 

3. Magnesium chloride 

4. dNTPs 

5. Taq polymerase 

6. Autoclaved Milli-Q water 

7. Primers (forward primer toxRPV [5'-ATGACTAATATCGGCAC-3'] 
and reverse primer toxS-R 
[5'-GCCATTCTTTAGAGGTCARNAVYTGYTC-3']) 

8. Nucleic acid purification kit 

9. BigDye® Terminator v3.1 cycle sequencing kit 



9.3 Methods 



9.3.1. Multilocus Sequence Analysis 

PCR 

Reaction mixture                     Cycling conditions 
10× buffer: 5.0 µl                   Denaturation: 94 °C for 5 min 
10 mM dNTPs mixture: 2 mM each       Denaturation: 94 °C for 30 s   | 
10 µM forward primer: 2.5 µl         Annealing: 55 °C for 30 s      | 30 cycles 
10 µM reverse primer: 2.5 µl         Extension: 72 °C for 2 min     | 
2.5 U/µl Taq polymerase: 1 µl        Extension: 72 °C for 5 min 
Water: 29.5 µl                       Holding: 4 °C forever 
Template DNA: 2 µl 
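
When booking a thermal cycler it helps to know how long the program above will run. The following is a minimal sketch that adds up the programmed hold times only; it assumes ramping between temperatures is ignored (real run times are longer and instrument dependent), and the variable and function names are illustrative.

```python
# Cycling conditions from the table above, as (step, temperature in °C, seconds).
INITIAL_STEPS = [("denaturation", 94, 5 * 60)]
CYCLED_STEPS = [
    ("denaturation", 94, 30),
    ("annealing", 55, 30),
    ("extension", 72, 2 * 60),
]
FINAL_STEPS = [("extension", 72, 5 * 60)]
N_CYCLES = 30


def programmed_seconds(initial, cycled, final, n_cycles):
    """Sum of all programmed hold times, excluding ramping and the 4 °C hold."""
    one_cycle = sum(seconds for _, _, seconds in cycled)
    fixed = sum(seconds for _, _, seconds in initial + final)
    return fixed + n_cycles * one_cycle


if __name__ == "__main__":
    total = programmed_seconds(INITIAL_STEPS, CYCLED_STEPS, FINAL_STEPS, N_CYCLES)
    print(f"Programmed hold time: {total} s ({total / 60:.0f} min)")
```

The same function can be reused for the other programs in this chapter by changing the step lists and the cycle count.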



The amplified PCR products were then run on a gel to check 
amplification, purified using the PCR Clean-up kit® (Sigma-Aldrich), 
and sequenced by the chain-termination technique on an ABI PRISM 
3100 genetic analyzer (Applied Biosystems). The raw sequences were 
processed using BIOEDIT software, and the phylogenetic tree was 
constructed using MEGA 5.0. 
9.3.2. Pulsed-Field Gel Electrophoresis 



1. Cultures were incubated in 15 ml of heart infusion broth at 
37 °C for 1-1.5 h with aeration until growth reached an 
optical density of 0.6 at 610 nm. 

2. Cells (10 ml) were harvested by centrifugation and were 
washed with 10 ml of wash buffer (1 M NaCl, 10 mM Tris 
[pH 8.0], 10 mM EDTA). 

3. Cells were resuspended in 1 ml of wash buffer and were 
warmed at 37 °C for a few minutes. Bacterial suspensions 
were mixed with an equal volume of 1 % chromosomal grade 
agarose (Bio-Rad, Richmond, Calif.) and were dispensed into 
a plug mold (Bio-Rad). 

4. Agarose plugs were allowed to solidify on ice for 10 min. 
Plugs were placed in clean tubes containing 3 ml of lysis buffer 
(1 M NaCl, 10 mM Tris [pH 8.0], 100 mM EDTA, 0.5 % 
Sarkosyl, 0.2 % sodium deoxycholate, 1 mg of lysozyme per 
ml, 2 µg of RNase per ml). 

5. Bacteria were lysed in the agarose plugs for 1 h at 37 °C. The 
lysis buffer was removed, and the plugs were incubated 
overnight in 3 ml of ESP buffer (0.5 M EDTA, 1 % Sarkosyl, 1 mg 
of proteinase K per ml) at 50 °C. 

6. The next day the plugs were rinsed briefly with deionized 
water. Plugs were washed twice in 2 ml of TE (10 mM Tris 
[pH 8.0], 1 mM EDTA) containing 30 µl of 0.1 M 
phenylmethylsulfonyl fluoride for 30 min each. Plugs were then 
washed four times in 3 ml of TE for 30 min each time. 

7. If the plugs were not to be used immediately, only two washes 
in TE were performed; this was followed by overnight 
incubation in 5 ml of TE at 4 °C. A small portion of the plug 
(2 × 7 mm) was sliced off and was incubated for 1 h in a 
microcentrifuge tube in 1 ml of NotI buffer (150 mM NaCl, 
10 mM Tris [pH 8.0], 10 mM MgCl2). 

8. The buffer was then replaced with 125 µl of fresh buffer 
containing 20 U of NotI (New England Biolabs, Inc., 
Beverly, Mass.), and the mixture was incubated for 4 h at 37 °C. 

9. Restriction fragments were separated in a 1 % Fast-lane 
agarose gel (FMC, Rockland, Maine) in 0.5× TBE (10× TBE is 
89 mM Tris base, 89 mM boric acid, and 2.5 mM disodium 
EDTA) by using a CHEF-DR II system (Bio-Rad). 

10. Bacteriophage lambda DNA ladders (FMC) were used as 
molecular mass standards. 

11. A model 1000 mini chiller (Bio-Rad) was used to maintain 
the temperature of the buffer at 14 °C. A ramp time of 5-50 s 
for 20 h at 200 V was used to maximize the separation of 
larger fragments. For longer gels, the run time was increased 
to 22 h. 

12. If separation of smaller fragments was necessary, a ramp time 
of 1-10 s for 12 h at 200 V was used. Following 
electrophoresis, gels were stained for 20 min with ethidium bromide 
(2 µg/ml in water), destained, and visualized on a UV light 
box. 

13. V. cholerae O1 isolates were separated into patterns on the 
basis of differences in band arrangements. 

14. Differences in the presence, absence, or intensity of a band 
among strains were given equal weights. 

15. Strains that differed by one band were assigned different 
pattern numbers. Pattern numbers were designated solely 
for discussion purposes and are not meant to imply 
relatedness between isolates or to fulfill a typing scheme. 


9.3.3. Multiplex Polymerase Chain Reaction 

1. Grow Vibrio strains in LB broth overnight at 37 °C. 

2. Boil the cultures for 10 min in a water bath and store them at 
−20 °C until further use; the boiled lysate serves as the PCR 
template. 

3. Perform multiplex PCR using the following protocol: 



PCR 

Reaction mixture                     Cycling conditions 
10× buffer: 10 µl                    Denaturation: 94 °C for 5 min 
Magnesium chloride: 10 µl            Denaturation: 94 °C for 30 s   | 
10 mM dNTPs mixture: 10 µl           Annealing: 55 °C for 30 s      | 30 cycles 
10 µM forward primer: 4.0 µl         Extension: 72 °C for 2 min     | 
10 µM reverse primer: 4.0 µl         Extension: 72 °C for 5 min 
2.5 U/µl Taq polymerase: 1 µl        Holding: 4 °C forever 
Water: up to 97 µl 
Template DNA: 3 µl 



4. Separate the PCR products by electrophoresis in a 2.5 % (w/v) 
agarose gel, stain the gel with ethidium bromide, and 
visualize the banding pattern under UV light. 

5. The banding pattern will appear as shown in Fig. 9.2. 
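
Reading the gel amounts to matching each observed band to the expected amplicon sizes in Table 9.1. The following is a minimal sketch of that lookup; the tolerance of 15 bp and the example lane are assumptions for illustration, not values from the protocol, and band sizes estimated from an agarose gel are approximate.

```python
# Expected amplicon sizes (bp) copied from Table 9.1.
EXPECTED_AMPLICONS = {
    "O1 rfb": 192,
    "ISR rRNA (VC)": 295,
    "O139 rfb": 449,
    "ctxA": 564,
    "tcpA Class": 620,
    "toxR": 779,
    "tcpA El Tor": 823,
    "sxt": 946,
}


def call_bands(observed_bp, tolerance_bp=15):
    """Map each observed band size to the nearest expected target, or None."""
    calls = {}
    for size in observed_bp:
        target, expected = min(
            EXPECTED_AMPLICONS.items(), key=lambda item: abs(item[1] - size)
        )
        # Leave the band unassigned if it is too far from every expected size.
        calls[size] = target if abs(expected - size) <= tolerance_bp else None
    return calls


if __name__ == "__main__":
    lane = [190, 300, 560, 780, 825]  # hypothetical size estimates for one lane
    for size, target in call_bands(lane).items():
        print(f"{size} bp -> {target}")
```

A band that falls outside the tolerance for every target is returned as `None` and should be inspected by eye rather than assigned automatically.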

Fig. 9.2 Banding pattern of the Vibrio isolates subjected to septaplex PCR assay. 


9.3.4. Analysis of Full-Length toxR Gene 

1. Extract the genomic DNA of the bacterial strain from an 
overnight culture using the DNeasy® Blood and Tissue 
Kit (QIAGEN GmbH, Germany) following the manufacturer's 
protocol. 

2. Amplify the toxR gene using the genomic DNA as template 
under the following conditions: 



PCR 

Reaction mixture                     Cycling conditions 
10× buffer: 5.0 µl                   Denaturation: 94 °C for 2 min 
10 mM dNTPs mixture: 0.8 mM each     Denaturation: 94 °C for 1 min  | 
Forward primer: 0.5 µM               Annealing: 51.4 °C for 1 min   | 30 cycles 
Reverse primer: 0.5 µM               Extension: 72 °C for 2 min     | 
2.5 U/µl Taq polymerase: 1 µl        Extension: 72 °C for 7 min 





3. Purify the amplified toxR gene using the Nucleospin® Nucleic 
Acid Purification Kit (Clontech Laboratories, Inc., USA) 
following the manufacturer's protocol. 

4. Have the DNA sequenced by 1st BASE Pte. Ltd. (Singapore) 
using the 3730XL Genetic Analyzer together with the BigDye® 
Terminator v3.1 Cycle Sequencing Kit (both from Applied 
Biosystems, USA), with toxRPV and toxS-R as sequencing primers. 

5. Perform the toxR gene homology search using the Basic 
Local Alignment Search Tool (BLAST) 
(http://www.ncbi.nlm.nih.gov/blast). 

6. Deduce the amino acid sequence of toxR using the Expert 
Protein Analysis System (ExPASy) Translate tool 
(http://www.expasy.ch/tools/dna.html; Swiss Institute of 
Bioinformatics, Switzerland) and align it with ToxR amino 
acid sequences from other Vibrio species. 

7. Construct the phylogenetic tree based on the toxR gene 
sequences of other Vibrio species using the Molecular Evolutionary 
Genetics Analysis (MEGA) software, version 4.0. 

8. For tree construction, employ the neighbor-joining (NJ) 
method with p-distance, and assess the reliability of the topologies 
by the bootstrap method with 10,000 replicates. 
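
The p-distance named in step 8 is simply the proportion of aligned sites at which two sequences differ. The following is a minimal sketch of that calculation, not MEGA's implementation; it assumes the sequences are already aligned to equal length and skips any site where either sequence has a gap (pairwise deletion, one of several possible gap treatments). The three short sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of compared aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # pairwise deletion: ignore sites with a gap
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(aligned):
    """Pairwise p-distances for a dict of {name: aligned sequence}."""
    names = list(aligned)
    return {
        (x, y): p_distance(aligned[x], aligned[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    alignment = {
        "strain_1": "ATGACTAATA",
        "strain_2": "ATGACCAATA",
        "strain_3": "ATG-CTAATT",
    }
    for pair, dist in distance_matrix(alignment).items():
        print(pair, round(dist, 3))
```

A matrix of this kind is the input from which a neighbor-joining tree is built; bootstrap support is then obtained by resampling alignment columns and repeating the calculation.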
Molecular Identification of Microbes: V. methylotrophs 

Kamlesh K. Meena, Manish Kumar, and D.P. Singh 



Abstract 

The methylotrophic bacterial community is well known for using C1 and multiple reduced carbon 
substrates as its source of carbon and energy, and it plays an important role in carbon cycling. This 
chapter describes molecular methods for the identification of cultivable and uncultivable 
methylotrophic strains. The tools and techniques for detecting the genes that encode methylotrophic 
functions, and for discriminating among strains, are illustrated in detail. Molecular identification of the 
isolates enables us to assess the phylogeny and functional diversity of the methylotrophic community. 



10.1 Introduction 



Since microorganisms were first isolated and grown in pure cul- 
ture, microbiological laboratories have needed to characterize 
isolates so that they can be differentiated from one another. The 
identification of a microbial isolate up to species level only 
amounts to a partial characterization of the isolate, but is still a 
very useful piece of information. Knowing the species allows the 
laboratory access to the body of knowledge that exists about that 
species. Schemes that can be used to describe the characteristics of 
a microbial isolate are essential in every branch of microbiology 
and their development and refinement have been constant. The 
advent of molecular biology in the 1980s contributed a set of 
powerful new tools that have helped microbiologists to detect the 
smallest variations within microbial species and even within indi- 
vidual strains. This has added an entirely new dimension to a 
science that was in danger of becoming constrained by its reliance 
on traditional laboratory techniques. In fact, the technology has 
progressed far beyond the level needed by most routine labora- 
tories, where identifying the species of any isolate is likely to be 
sufficient. 

Methylotrophic bacteria are a diverse group of microbes 
that utilize one-carbon compounds more reduced than CO2 as 
sole energy sources and assimilate carbon at the oxidation level of 
formaldehyde [1,2]. Although the ability to grow methylotrophically 
was first discovered in the early 1900s, it was not until the 1960s to 
1970s that an understanding of the biochemical nature of this 
capability started to emerge. Bacteria able to grow on methane are 
a subset of the methylotrophs called methanotrophs [3]. These 
subpopulations of bacteria grow on a variety of carbon substrates 
such as methane, methanol, methylated amines, halogenated 
methanes, and methylated sulfur species. Methylobacteria are 
ubiquitous and present in diverse environments such as the leaf 
phyllosphere [4], sea water [5], deep-sea sediments [6], drinking 
water, chlorinated environments [7], hot water effluent [8], 
hypersaline lakes [9], and fresh water lakes [10]. The ability to 
grow on reduced C1 compounds requires the presence of unique 
biochemical pathways for both energy and carbon metabolism. 
A key feature of aerobic methylotrophy is the role of formaldehyde 
as a central intermediate. In most methylotrophs, the pool of 
formaldehyde generated from methylotrophic substrates is split, 
with part being oxidized to CO2 for energy and part being 
assimilated into cell carbon via one of two unique pathways, the serine 
cycle or the ribulose monophosphate cycle. Bacteria that oxidize 
C1 substrates to CO2 and assimilate it via the classical 
Calvin-Benson-Bassham cycle are called "pseudomethylotrophs" or 
"autotrophic methylotrophs" [11]. Here we describe the molecular 
identification of methylotrophs (C1 carbon substrate utilizers), 
together with the tools and techniques for distinguishing different 
strains. Nevertheless, methods and equipment designed to help with 
both species identification and typing are commercially available 
for a range of applications. 


10.1.1. Traditional Identification Method 

Given the microbial nature of the disease, traditional or 
culture-based microbiology studies were carried out by a number of oral 
microbiology groups throughout the world. Over the next 
two decades or so, it was shown that the bacterial species associated 
with these lesions were surprisingly limited, given the number of 
taxa potentially able to colonize and the large number of taxa 
associated with periodontal lesions. This reduced diversity implies 
special selective pressures operating within the root canal system. 
While culture-based techniques have reported 4-12 taxa per 
root canal, 20-30 genera are commonly isolated when the range of 
taxa isolated from root canal infections as a group is taken into 
account; of these, the most commonly occurring are 
Fusobacterium nucleatum, Streptococcus species, Porphyromonas 
species, Prevotella intermedia, Peptostreptococcus species, 
Actinomyces species, and Eubacterium species. (The genus Eubacterium 
is very broad and at present undergoing significant taxonomic 
revision.) The isolation and identification of these taxa led to 
large numbers of studies aimed at defining which taxa were 
responsible for the disease, what mechanisms they used, and, 
indeed, at associating particular taxa with different aspects of root 
canal infections, e.g., pain and lesion size. Early microscopy 
studies showed that 50 % of the oral microbiota was unculturable. 
It was therefore very possible that unculturable taxa were present 
in root canal infections and were potentially playing a role in 
disease initiation, progression, or both. These unculturable taxa 
fall into two broad categories. The first comprises taxa that need 
nutrients or other essential components that conventional sampling 
techniques, transport conditions, or laboratory media do not provide. 
This could be sensitivity to oxygen (i.e., very strict anaerobes) or 
an absolute requirement for products provided by other taxa 
within the root canal. These taxa are therefore broadly unknown 
apart from microscopy studies; unless a distinct morphology is 
apparent, there is no way of knowing what proportion of the taxa 
is represented in the culture-dependent proportion of the sample. 
The second category contains taxa that are known, and 
very often common, but for some reason cannot be cultured, i.e., 
they are in a dormant, "non-culturable" state. The term 
"viable but not culturable" (VBNC) was coined to describe this 
state. It is thought that cells enter this state as a protection 
strategy in response to adverse environmental conditions. It is very 
possible that "adverse" conditions, especially nutrient deprivation, 
exist within root canals, and this may be another explanation for 
the limited taxa isolated from individual root canal infections. While 
microbiologists may have suspected that a number of taxa were 
present and unculturable (for whatever reason), there was very 
little that could be done other than using complex media to 
mimic the conditions present at the site of isolation or using 
co-culture strategies. At the end of the day they had to be able 
to culture the taxa before they could identify or characterize 
them. 


10.1.2. Molecular Identification 

With the advent of "molecular biology," microbiologists had 
another avenue to pursue with respect to understanding the 
microbiology. Shortly after Kary Mullis described the polymerase 
chain reaction (PCR) technique, the floodgates opened with 
respect to what was possible in the world of microbial detection 
and identification. The application of PCR and sequencing (and 
associated database construction and searching software) 
revolutionized the detection and identification of bacteria. PCR is a 
technique that uses a DNA polymerase enzyme to make a 
huge number of copies of virtually any given piece of DNA or 
gene. It allows a short stretch of DNA (usually fewer than 
3,000 bp) to be amplified about a millionfold. In practical 
terms, it yields enough specific copies to carry out 
any number of other molecular biology applications, e.g., 
determination of the fragment's size (in bases) and its nucleotide 
sequence. The particular stretch of DNA to be amplified, called the 
target sequence, is identified by a specific pair of DNA primers 
(Table 10.1), oligonucleotides usually about 20 nucleotides in 
length, which designate the outer limits of the amplification 
product. Given that there are about 500 bacterial taxa present in the 
oral cavity, the range and complexity of the techniques used to 
identify this very diverse microbiota is bewildering. Molecular 
biology techniques have led to new approaches for bacterial 
identification. The use of nucleotide sequence data from 16S 
ribosomal RNA genes (among others) now makes it possible not only 
to identify but also to infer phylogeny for all organisms on Earth. 
Phylogeny is defined as the evolutionary relationships within and 
between taxonomic levels, particularly the patterns of lines of 
descent, in a sense a family tree spanning 3.5 billion years. 
Therefore, within reason, a single methodology can be used to 
identify any bacterial isolate from any environment. 


Table 10.1 
List of primers used for identification of functional groups 
of methylotrophic bacteria 

Gene     Target                                Primer sequence (5'-3') 
pmoA     Particulate methane monooxygenase     GGNGACTGGGACTTCTGG 
                                               GAAGSCNGAGAAGAASGC [12] 
mmoX     Methane monooxygenase                 GGCTCCAAGTTCAAGGTCAG 
                                               TGGCACTCGTAGCGCTCCGGCTCG [12] 
fae1     Formaldehyde activating enzyme        GTCGGCGACGGCAAYGARGTCG 
                                               GTAGTTGWANTYCTGGATCTT [13] 
fae1     Formaldehyde activating enzyme        GCACACATCGACCTSATCATSGG 
                                               CCAGTGRATGAAVACGCCRAC [13] 
Serine   Serine-glyoxylate aminotransferase    ATGGC(AGCT) ATGAA(CT)AT(ACT)CC(AGCT) 
         of serine pathway                     ATGGATA(AGCT)GG(AG)AA(AG)AA(AGCT)CC-(CT) 
                                               TC(AG)TC [14] 
HPS      Hexulose phosphate synthase           ATGAAGCT1CAGGTC(A/G/T)GC1ATC(A/T)GA- 
         of RuMP pathway                       CC(A/G/T)GCGTGCATCTCC(A/G/T)ACGAA [15] 
Mau      Methylamine dehydrogenase             ARK CYT GYG ABT AYT GGG G 
                                               GAR AYV GTG CAR TGR TAR GTC [16] 
mxaF     Methanol dehydrogenase                GCGGCACCAACTGGGGCTGGT 
                                               GGGCAGCATGAAGGGCTCCC [9] 
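
Several primers in Table 10.1 contain IUPAC degenerate codes (N, S, Y, R, W, V, and so on), so each such primer is really a pool of related oligonucleotides. The following is a minimal sketch that counts how many distinct sequences a degenerate primer represents and converts it into a regular expression for searching a target sequence; the function names are illustrative, and the code table is the standard IUPAC nucleotide nomenclature.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def to_regex(primer):
    """Regular expression matching every sequence the primer can represent."""
    parts = []
    for base in primer.upper():
        choices = IUPAC[base]
        parts.append(choices if len(choices) == 1 else f"[{choices}]")
    return "".join(parts)


if __name__ == "__main__":
    pmoa_forward = "GGNGACTGGGACTTCTGG"  # pmoA primer from Table 10.1
    pmoa_reverse = "GAAGSCNGAGAAGAASGC"  # pmoA primer from Table 10.1
    print(degeneracy(pmoa_forward), to_regex(pmoa_forward))
    print(degeneracy(pmoa_reverse), to_regex(pmoa_reverse))
    print(bool(re.fullmatch(to_regex(pmoa_forward), "GGAGACTGGGACTTCTGG")))
```

Primers written with parenthesized alternatives, such as the serine pathway and HPS entries, would need to be rewritten in single-letter IUPAC codes before being passed to these functions.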

The 16S (small subunit) rRNA gene was selected as a candidate 
molecule for a number of reasons, viz., (a) it is present in all 
organisms and performs the same function, (b) its sequence is 
sufficiently conserved and contains conserved, variable, 
and hypervariable regions, and (c) it is small enough (ca. 1,500 
bases) to be relatively easily sequenced but large enough to 
contain sufficient information for identification and phylogenetic 
analysis. Denaturing gradient gel electrophoresis, cloning, 
sequencing, pyrosequencing, and real-time PCR give insight 
into metagenomic sequences. Figure 10.1 describes the different 
steps used for molecular identification of both cultivable and 
uncultivable methylotrophic bacteria. For the identification of 
uncultivable methylotrophic bacteria, the community DNA is 
extracted directly from environmental samples (soil, 
water, and sediments), whereas a pure culture is used for 
genomic DNA extraction for molecular identification of cultivable 
methylotrophic bacteria. 

Fig. 10.1 Pictorial representation of different steps [1-3, 7, 8, 11, 12, 17] used for molecular identification of both 
cultivable and uncultivable methylotrophic bacteria. Figure panel text: PCR amplification for functional and 
phylogenetic chronometer 16S rRNA gene (1.5 kb) and cloning (blue-white screening); positive clones were 
selected for further steps 
10.2 Materials 

1. SET buffer (75 mM NaCl, 25 mM EDTA, and 20 mM Tris) 

2. Lysozyme (10 mg ml⁻¹) 

3. 0.8 % agarose gel 

4. Proteinase K and RNase 

5. Ethidium bromide for staining the gel 

6. UltraClean Soil DNA isolation kit 

7. Primers pA 5'-AGAGTTTGATCCTGGCTCAG-3' (E. coli 
position 8-27) and pH 5'-AAGGAGGTGATCCAGCCGCA-3' 
(E. coli position 1525-1544) 

8. 100 µM (each dNTP) dATP, dCTP, dTTP, and dGTP 

9. Taq polymerase (1.0 U per reaction) 

10. Thermal cycler 

11. Primers mxaF-1003 (5'-GCGGCACCAACTGGGGCTGGT-3'; 
forward) and mxaR-1561 (5'-GGGCAGCATGAAGGGCTCCC-3'; 
reverse) 

12. For the environmental DNA, GC clamp (CGC CCG CCG 
CGC GCG GCG GGC GGG GCG GGG GCA CGG GGG 
G) attached at the 5' end of one of the mxaF or mxaR primers 

13. Alpha Imager analysis system 

14. Restriction endonucleases HaeIII, MspI, and EcoRI 

15. 6.5 % (w/v) acrylamide:bisacrylamide (37.5:1) 

16. 1.0× TAE, pH 7.4 (0.04 M Tris-base, 0.02 M sodium 
acetate, 1 mM EDTA) 

17. 7 M urea 

18. 40 % (v/v) formamide 

19. QIAquick PCR purification kit 

20. Fluorescent terminators 

21. Sterile Milli-Q purified water 

22. 2 µl MgCl2 (25 mM) 

23. 2 µl of SYBR Green master mix (20 pmol) 

24. 70 % ethanol 

25. Absolute ethanol (99.9 %) 

26. Water-saturated phenol 
10.3 Method 



10.3.1. Genomic and Community DNA Extraction and PCR 
Amplification of 16S rRNA and mxaF Gene 

1. Log phase cultures from NMS broth were used for genomic 
DNA isolation. Pelleted cells from 1.5 ml media were 
resuspended in 0.5 ml SET buffer (75 mM NaCl, 25 mM EDTA, 
and 20 mM Tris) with 10 µl of lysozyme (10 mg ml⁻¹), and 
genomic DNA was extracted following the method described 
by Pospiech and Neumann [18]. 

2. The integrity and concentration of purified DNA were 
determined by agarose gel electrophoresis on a 0.8 % agarose gel 
stained with ethidium bromide. 

3. DNA from sediment samples was extracted using the UltraClean 
Soil DNA isolation kit. The concentration of genomic DNA 
was adjusted to a final concentration of 50 ng µl⁻¹ for PCR 
amplification. The 16S rRNA gene was partially amplified using 
primers pA 5'-AGAGTTTGATCCTGGCTCAG-3' (E. coli 
position 8-27) and pH 5'-AAGGAGGTGATCCAGCCGCA-3' 
(E. coli position 1,525-1,544) [17]. 

4. The amplification was carried out in a 100-µl volume by 
mixing 50-90 ng template DNA with the polymerase reaction 
buffer (10×); 100 µM (each) dATP, dCTP, dTTP, and dGTP; 
primers pA and pH (20 ng each); and 1.0 U Taq polymerase, 
using the following conditions on a thermal cycler: initial 
denaturation at 94 °C for 1.5 min; 35 cycles of 95 °C for 1 min, 
55 °C for 1 min, and 72 °C for 1 min; and a final extension at 
72 °C for 5 min. 

5. The mxaF gene was used to identify or to authenticate 
populations capable of oxidizing methanol to formaldehyde. 
The mxaF gene in the isolates was partially amplified 
using specific primers, mxaF-1003 
(5'-GCGGCACCAACTGGGGCTGGT-3'; forward) and mxaR-1561 
(5'-GGGCAGCATGAAGGGCTCCC-3'; reverse). For the environmental 
DNA, a GC clamp (CGC CCG CCG CGC GCG GCG GGC 
GGG GCG GGG GCA CGG GGG G) attached at the 5' end of 
one of the primers was used to increase the separation of 
DNA bands in the DGGE gel. 

6. The PCR conditions were similar to those for 16S rDNA 
amplification. PCR products were separated on a 1.5 % agarose gel 
stained with ethidium bromide and documented in the Alpha 
Imager analysis system. 
10.3.2. Amplified Ribosomal DNA Restriction Analysis (ARDRA) and RFLP 

1. Following 16S rDNA amplification, the products were 
digested with selected restriction enzymes having different 
restriction sites. Approximately 1 µg of PCR-amplified 16S 
rDNA fragments was restricted separately with three different 
endonucleases (HaeIII, MspI, and EcoRI), incubated at 37 °C 
overnight, and resolved on 2 % agarose gels. 

2. The mxaF gene PCR products were digested with HaeIII and 
EcoRI and separated by PAGE. The banding patterns 
obtained were visualized by ethidium bromide staining and 
documented in the Alpha Imager documentation analysis system. 

3. Different phylotypes or operational taxonomic units were 
obtained by similarity and clustering analysis using the 
software NTSYSpc-2.02e. Similarity among the isolates was 
calculated by Jaccard's coefficient [11], and a dendrogram was 
constructed using the UPGMA method. 


10.3.3. Cloning (Electroporation Method) 

Cloning is an important step in the molecular identification of any 
organism. Basically, cloning is a three-step process: competent 
cell preparation, ligation, and transformation. Competent 
cells can be prepared by two different methods (electroporation 
and the chemical method). The electroporation procedure is 
described below. 


10.3.3.1. Preparation of Electro-competent DH10B Cells 

1. Pick a colony of DH10B and inoculate it into 5 ml of LB 
broth. Grow overnight at 37 °C with shaking. 

2. Next morning, add 1 % of the overnight culture to 500 ml of LB 
medium and grow at 37 °C with shaking until the OD600 
reaches ~0.7 (this takes ~2 h). 

3. Cool cells in a cold room on ice for ~20 min. 

4. Spin the centrifuge rotor for 5 min to precool it to 4 °C. 

5. Pour cells into 2 precooled GS3 bottles, 250 ml each. Spin at 
5,000 rpm for 15 min at 4 °C. 

6. Decant supernatant and resuspend cells in 500 ml of sterile 
ice-cold water. 

7. Spin at 5,000 rpm for 15 min at 4 °C. Decant supernatant and 
resuspend cells in 250 ml of sterile ice-cold water. 

8. Centrifuge at 5,000 rpm for 15 min at 4 °C. Decant 
supernatant and resuspend cells in 125 ml of sterile ice-cold water. 

9. Spin at 5,000 rpm for 15 min at 4 °C. Decant and resuspend 
cells in 10 ml ice-cold 10 % glycerol. Transfer cells into a 50 ml 
centrifuge tube and centrifuge in a table-top centrifuge at maximum 
speed for 15 min at 4 °C. Decant supernatant and resuspend 
cells in 0.5-1 ml sterile 10 % glycerol. 

10. Aliquot 100 µl of cells into precooled microcentrifuge 
tubes on ice and store the electro-competent cells at −70 °C until 
use. 


10.3.3.2. Ligation 

1. Set up the ligation reaction as follows (Table 10.2) and 
incubate the ligation mixture at 22 °C overnight. 

2. Inactivate the ligation by heating at 70 °C for 10 min and use 
the ligation mixture for the electrotransformation of E. coli 
DH10B cells. 


Table 10.2 
Ligation reaction mixture 

Reagents                           Quantity (µl) 
Dephosphorylated vector            2.0 
5× Ligation buffer                 2.5 
3-8 kb genomic DNA fragments       12.5 
Deionized water                    11.0 
T4 DNA ligase                      2.0 
Total volume                       30.0 


10.3.3.3. Transformation 

1. Prior to electroporation, the ligation mixture must be 
precipitated with ethanol or diluted to prevent the sample from 
causing an arc to jump across the cuvette upon application 
of the pulse. 

2. Thaw an aliquot of DH10B cells on ice. When the cells are 
thawed, add 1-10 µl of ligation mixture to the cells and mix 
by tapping gently. 

3. Carefully pipette the cell/DNA mixture into a chilled 0.1 cm 
cuvette. Gently tap the cuvette to ensure that the cell/DNA 
mixture makes contact all the way across the bottom of the 
cuvette chamber. Avoid formation of bubbles. 

4. Wipe the outside of the cuvette with a tissue to dry it, place it 
in the electroporation chamber, and apply the pulse. For the BioRad 
GenePulser® II electroporator, the recommended pulse 
conditions are 2.0 kV, 200 Ω, and 25 µF. 

5. Immediately after pulsing, add 900 µl of SOC medium and 
transfer the solution to a microcentrifuge tube. Delaying this 
transfer can seriously reduce the survival of transformed cells. 

6. Incubate at 37 °C for 1 h with shaking at 225 rpm. 

7. Spread the cells on LB agar plates containing ampicillin 
(100 µg ml⁻¹), X-gal (20 µg ml⁻¹), and IPTG (40 µg ml⁻¹). 

8. Incubate plates overnight at 37 °C to permit the color to 
develop sufficiently to distinguish blue colonies from white. 



10.3.4. DGGE (Denaturing Gradient Gel Electrophoresis) Profiling 

1. The mxaF PCR products were purified using an SV-PCR 
purification kit according to the manufacturer's instructions. 
DGGE was performed as described by Muyzer et al. 
[19]. Briefly, PCR products were separated on a 1.0-mm 
thick, vertical polyacrylamide gel (6.5 % [w/v] 
acrylamide:bisacrylamide [37.5:1]; Bio-Rad) prepared with and 
electrophoresed in 1.0× TAE, pH 7.4 (0.04 M Tris-base, 0.02 M 
sodium acetate, 1 mM EDTA) at 60 °C and a constant voltage 
of 150 V for 16 h. 

2. In the denaturing gradient, 100 % denaturant corresponded to 
7 M urea plus 40 % (v/v) formamide. The gels were loaded 
with 30 µl of PCR product, depending on the band intensity 
following electrophoresis on 1.5 % agarose gels. 

3. DGGE gels were stained for 20 min in water containing 
0.5 µg ml⁻¹ ethidium bromide. Images were recorded in the 
gel documentation system. DNA bands migrating to the same 
position in the gel were assumed to be identical amplicons. 
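
Because 100 % denaturant is defined as 7 M urea plus 40 % (v/v) formamide, the urea and formamide needed for any intermediate denaturant stock follow by simple proportion. The following is a minimal sketch of that arithmetic; the 60 % and 100 ml values in the example are hypothetical (the gradient range is not stated in this protocol), and the urea molar mass of 60.06 g/mol is the only constant added here.

```python
UREA_MOLAR_MASS = 60.06        # g/mol
UREA_MOLARITY_AT_100 = 7.0     # mol/l in 100 % denaturant
FORMAMIDE_FRACTION_AT_100 = 0.40  # v/v in 100 % denaturant


def denaturant_recipe(percent_denaturant, volume_ml):
    """Urea (g) and formamide (ml) for a stock of the given denaturant strength."""
    if not 0 <= percent_denaturant <= 100:
        raise ValueError("percent_denaturant must be between 0 and 100")
    fraction = percent_denaturant / 100
    urea_g = UREA_MOLARITY_AT_100 * UREA_MOLAR_MASS * (volume_ml / 1000) * fraction
    formamide_ml = FORMAMIDE_FRACTION_AT_100 * volume_ml * fraction
    return urea_g, formamide_ml


if __name__ == "__main__":
    # Hypothetical example: 100 ml of a 60 % denaturant stock.
    urea_g, formamide_ml = denaturant_recipe(60, 100)
    print(f"urea: {urea_g:.1f} g, formamide: {formamide_ml:.1f} ml")
```

The acrylamide, TAE buffer, and water that make up the rest of each stock are not covered by this sketch.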



10.3.5. 16S rRNA and mxaF Gene Sequencing 

1. The PCR-amplified 16S rDNA products were purified with a 
QIAquick purification kit. The DNA sequence was double 
checked by sequencing both strands using primers pA and pH 
for the forward and reverse reactions, respectively. 

2. The nucleotide sequences were dideoxy cycle sequenced with 
fluorescent terminators and run in a 3130xl Applied Biosystems 
ABI Prism automated DNA sequencer. 

3. The DGGE bands were excised from the gel using a sterile 
scalpel and incubated in 60 µl of sterile Milli-Q purified water for 
24 h at 4 °C. After this period, the DNA has diffused out 
of the gel and the solution can be used as the template in a 
re-amplification PCR. 

4. Re-amplification was performed using the original primers 
but modified PCR programs, and the product was run on DGGE 
to confirm its identity. Only pure bands were used for sequencing, 
after amplification with primers without a GC clamp. 

5. PCR products for sequencing were purified and sequenced 
using the ABI Prism sequencer. The representatives of the 
dendrogram constructed from the mxaF-RFLP (PAGE) patterns were 
also sequenced with fluorescent terminators and run in the same 
DNA sequencer. 

6. Sequences were aligned to related sequences in the public 
NCBI databases. Phylogenetic trees were calculated and 
drawn using the neighbor-joining algorithm in the MEGA 4.0 
software. For tree construction, the aligned positions 
of mxaF gene sequences (both from cultures and from 
the DGGE bands) were compared with sequences from the 
database. 


10.3.6. BLAST Search and Phylogenetic Analysis 

1. The partial 16S rDNA sequences, mxaF gene sequences of 
isolated strains, and mxaF gene sequences from environmental 
DNA were compared with those available in the databases. 

2. Identification was based on sequence similarity of >97 % 
with public database sequences (NCBI) by BLAST 
homology. Sequence alignment and comparison were 
performed using the multiple sequence alignment program 
CLUSTALW2 with default parameters, and the data were 
converted to PHYLIP format. Minor modifications were made 
manually on the basis of conserved domains, and columns 
containing more than 50 % gaps were removed. The 
phylogenetic tree was constructed from the aligned datasets by the 
neighbor-joining (NJ) method using the program MEGA 
4.0.2. Bootstrap analysis was performed on 
1,000 random samples taken from the multiple alignments. 


10.3.7. Quantification by Real Time PCR 

1. Quantification of the mxaF gene was expressed as copy number
using LightCycler software 3.5, based on the “second derivative
maximum method,” in which the exponential phase of the
amplification curve is linearly related to the starting
concentration of template DNA molecules.

2. Quantitative PCR using SYBR Green I technology with the
primers mxaF and mxaR was carried out by amplifying five
environmental DNA samples from sediment, a negative control, and
five plasmid DNA standards. The master mix was prepared with
14 µl of sterile water, 2 µl of MgCl2 (25 mM), 1 µl of each primer
(20 pmol), 2 µl of SYBR Green master mix (20 pmol)
[Roche Diagnostics], and 50 ng of gDNA. The amplification
program consisted of 10 min of denaturation at 95 °C, followed by
40 cycles of four-segment amplification: 15 s at 95 °C for
denaturation, 10 s at 55 °C for annealing, 20 s at 72 °C for
elongation, and 5 s at 83 °C for a single fluorescence
measurement above the melting temperature of possible primer
dimers. This fourth segment eliminates nonspecific fluorescence
signal and ensures accurate quantification of the desired product.
Subsequently, a melting step was performed, consisting of 10 s at
95 °C, 10 s at 60 °C, and slow heating at a rate of 0.1 °C per
second up to 99 °C with continuous fluorescence measurement.
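
Absolute quantification against plasmid DNA standards rests on a standard curve that relates the quantification cycle to the logarithm of the starting copy number. The Python sketch below illustrates that general principle only; it is not the LightCycler software's own implementation. The function names, the copy numbers, and the cycle values are hypothetical inputs chosen for illustration, not data from this protocol.

```python
import math


def fit_standard_curve(copies, cq):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cq(cq_value, slope, intercept):
    """Invert the standard curve to estimate starting copies."""
    return 10 ** ((cq_value - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency per cycle implied by the slope (1.0 means doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical five-point plasmid dilution series (illustrative values).
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_cq = [30.0, 26.6, 23.2, 19.8, 16.4]

slope, intercept = fit_standard_curve(standard_copies, standard_cq)
estimate = copies_from_cq(21.5, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"efficiency={amplification_efficiency(slope):.2f}")
print(f"estimated copies in unknown sample: {estimate:.3g}")
```

A slope close to -3.3 corresponds to near-doubling in each cycle, so the fitted slope is a quick check on the quality of the dilution series before unknowns are read from the curve.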
Preservation and Maintenance of Microbial Cultures

Sudheer Kumar, Prem L. Kashyap, Ruchi Singh, and Alok K. Srivastava 



Abstract 

The preservation and maintenance of microbial cultures require special and careful attention, reliable 
preservation and appropriate quality control to ensure that recovered cultures perform in the same way as 
the original cultures. This requires a high degree of expertise in the maintenance and management of 
microbial cultures at ultralow temperatures, or as freeze-dried material, to secure their long-term integrity 
and relevance for future research, development, and conservation. This chapter outlines some of the 
important procedures and protocols involved in the conservation, preservation, and maintenance of 
microbial cultures. 



11.1 Introduction 



Microbes account for the majority of the biomass and biodiversity
of life on Earth. They play pivotal roles in biogeochemical
cycles and harbor novel metabolites and genes that have industrial
and agricultural uses [1, 2]. For these reasons, the conservation
and preservation of microbes is a high-priority task for most
microbiological laboratories engaged in research, teaching, or
industrial application. In addition, the availability of
authentic, high-quality microbial cultures is a significant
advantage for promoting research, standardization, efficiency, and
laboratory safety [3]. Therefore, it is important that microbial
resources be preserved in a physiologically and genetically
stable state.

The preservation of microorganisms by different methodolo- 
gies has been employed for decades [4]. The primary methods of 
culture preservation are continuous growth, drying, and freezing. 
Continuous growth methods, in which cultures are grown on 
agar, typically are used for short-term storage. Such cultures are 
stored at temperatures from 5 to 20 °C to increase the interval 
between subculturing. The methods are simple and inexpensive 
because no specialized equipment is required. Drying is the most
useful method of preservation for cultures that produce spores or
other resting structures. Silica gel, glass beads, and soil are
substrata commonly used in drying. Fungi have been stored
successfully on silica gel for up to 11 years [5]. Drying methods
are technically simple and do not require expensive equipment.
Freezing methods, including cryopreservation, are versatile and
widely applicable. Most microorganisms can be preserved,
with cryoprotectants, in liquid nitrogen vapor or in standard
ultra-low temperature freezers [6]. With freeze-drying, or
lyophilization, the microbial cultures are frozen and subsequently
dried under vacuum. The method is highly successful with cultures
that produce mitospores. Freeze-drying and freezing below −135 °C
are excellent methods for permanent preservation of bacteria,
actinomycetes, yeasts, and fungi [7]. For long-term storage,
cultures are usually preserved by lyophilization or by
ultra-freezing [3, 8, 9]. In these methods, two basic approaches
are employed to slow the rate of deleterious reactions in the
microbial culture. The first is to lower the temperature, which
decreases the rate of all chemical reactions. This can be done
using refrigerators and liquid nitrogen freezers. The second is to
remove water from the culture, a process that can be tricky and
involves sublimation of water using a lyophilizer [10]. In this
chapter, we provide an overview of some of the important
procedures involved in the short- and long-term maintenance and
conservation of microbial cultures.


11.2 Periodical Transfer to Fresh Media


Microbial strains can be maintained by periodically preparing a 
fresh culture from the previous stock culture. The culture 
medium, the storage temperature, and the time interval at which 
the transfers are made vary with the species and must be ascer- 
tained beforehand. The temperature and the type of medium 
chosen should support a slow rather than a rapid rate of growth, 
so that the time interval between transfers can be as long as 
possible. Many of the microbes remain viable for several weeks 
or months on a medium like nutrient agar or potato dextrose agar 
(PDA). This method is generally used for maintenance of cyano- 
bacteria. The periodic transfer method has the disadvantage of 
failing to prevent changes in the characteristics of a strain due to 
the development of variants and mutants. 
11.3 Refrigeration



Microbes can be preserved for a short period of time at 4 °C.
Cultures that are required frequently in active growth on agar
slants or plates can be stored in a refrigerator, and precautions
must be taken to avoid contamination. Cultures should be prepared
using standard techniques and then sealed before storing. For
slants, it is recommended to use screw-capped tubes. For cultures
on Petri dishes, the plates need to be sealed with Parafilm.
Sealing the plates not only helps to prevent molds from entering
the plates, but also slows the drying of the agar. For short-term
storage of a week or two, cultures can be stored as stabs in
small, flat-bottomed screw-capped vials. In this technique, vials
are filled with a small amount of agar medium (e.g., 1 ml) and
sterilized. Microbes (e.g., bacteria) are then introduced into the
solidified agar with a sterile needle. The culture is incubated
overnight and then stored at 4 °C. Cultures stored in stabs are
more resistant to drying and contamination, but they will lose
viability more quickly than frozen stocks. The length of time a
stab can remain viable is dependent upon the strain.



11.4 Mineral Oil or Liquid Paraffin Storage

Covering the fresh growth on media slants with sterile mineral oil
or liquid paraffin can preserve many bacteria and fungi. It was
applied for the first time in 1914 by Lumière to keep the gonorrhea
agent (Neisseria gonorrhoeae). The basic idea of the method is to
cover a well-grown culture on nutrient agar medium with sterile
mineral oil. The most commonly used oil is paraffin or vaseline
in a layer 1-2 cm thick. The aim is to limit oxygen access, which
reduces the metabolism and growth of the microorganism, and to
restrict drying of the cells during preservation. Cell viability
with this method is high compared with frequent transfer
and storage at low temperature. The preservation period for
bacteria from the genera Azotobacter and Mycobacterium is from 7 to
10 years; for Bacillus it is 8-12 years.



11.4.1. Materials 



1. High quality liquid paraffin with 0.830-0.890 specific gravity
(Tyndalized at 121 °C for 20 min).

2. Sterile semisolid growth medium in universal bottles (30 ml) 
kept at 30° slope. 

3. Metal segmented trays (37.5 × 17.5 cm, having 25 × 25 mm
markings).

4. Inoculating needle or loop. 
11.4.2. Methods



1. Inoculate at least two universal bottles for each strain to be 
maintained. 

2. Label one culture as reserve stock and the other(s) as working 
stock. 

3. Incubate at optimum growth temperature until the organism 
has reached maturity. 

4. Add sterile liquid paraffin (8-10 ml) to cover the slope to a
maximum depth of 10 mm from the top of the slant edge. The liquid
paraffin must not take up moisture during sterilization; if it
absorbs moisture it becomes turbid and whitish, which leads
to the deterioration of cultures during storage. Remove any
moisture by gentle heating in an oven until the paraffin becomes
transparent again.

5. Store the oiled cultures, with the screw caps loose, in metal 
divided racks at 15-25 °C. 

6. Recovery: 

(a) Remove a portion of the working stock culture using a 
sterile needle or loop. 

(b) Drain as much oil as possible from the inoculum. 

(c) Inoculate on fresh growth medium. It is often best to 
inoculate a slope so that the adhering oil can drain and 
the organism can grow up the slope away from the oil at 
the point of inoculation. 

(d) The reserve stock culture is used only when re-preservation
becomes necessary: when all the inoculum has been
removed, when it is contaminated, or when the shelf life
for the organism has been reached.



11.5 Freezing



Freezing is a good way to store bacteria and most fungi.
Generally, the colder the storage temperature, the longer the
culture will remain viable. Freezers can be split into three
categories: laboratory, ultralow, and cryogenic. However, ice
crystal formation is the major problem when bacteria are stored at
low temperature. Ice can damage cells by dehydration caused by
localized increases in salt concentration. As water is converted
to ice, solutes accumulate in the residual free water, and this
high concentration of solutes can denature biomolecules. To lessen
the negative effects of freezing, glycerol is often used as a
cryoprotectant. With bacteria, adding glycerol to a final
concentration of 15 % will help to keep cells viable under all
freezing conditions.
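
The 15 % final glycerol concentration follows from simple dilution arithmetic: mixing a stock with a culture that contains no glycerol dilutes the stock in proportion to the volumes combined. The Python sketch below works through that calculation; the function names are hypothetical, and the volumes in the example are illustrative.

```python
def final_concentration(stock_percent, stock_volume, sample_volume):
    """Concentration (%) after mixing a stock with a sample that contains none of it."""
    total = stock_volume + sample_volume
    if total <= 0:
        raise ValueError("combined volume must be positive")
    return stock_percent * stock_volume / total


def stock_volume_needed(stock_percent, target_percent, sample_volume):
    """Volume of stock to add to sample_volume to reach target_percent."""
    if not 0 < target_percent < stock_percent:
        raise ValueError("target must lie between 0 and the stock concentration")
    return target_percent * sample_volume / (stock_percent - target_percent)


# Equal volumes of 30 % glycerol and culture (volumes in microliters).
print(final_concentration(30.0, 500.0, 500.0))
# Stock volume required to bring 500 microliters of culture to 15 %.
print(stock_volume_needed(30.0, 15.0, 500.0))
```

The same relation applies to any cryoprotectant stock, provided the stock and the culture are measured in the same volume units.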
11.5.1. Freezing Bacteria Using Glycerol

Bacteria can be frozen using 15 % glycerol. The process is simple
and requires screw cap microfuge tubes and sterile glycerol. The
glycerol is diluted to 30 %, and equal volumes of the diluted
glycerol and culture broth are mixed, dispensed into tubes, and
then frozen.



11.5.1.1. Materials

1. Centrifuge tube (2 ml)

2. Sterile cryovial

3. Cryotags

4. Autoclaved distilled water

5. Vortex mixer

6. Freezer

7. Sterile pipette tips

8. Glycerol (30 %, v/v)



11.5.1.2. Method

1. Prepare a solution of 30 % glycerol (v/v) by mixing 30 ml of
glycerol with 70 ml of water. Transfer the solution to a screw
cap glass bottle and sterilize by autoclaving at 121 °C for
15 min.

2. Aliquot 500 µl of sterile 30 % glycerol into sterile 2 ml
microfuge tubes (see Note 1).

3. Add 500 µl of bacterial culture to the tube and mix with the
glycerol using a vortex mixer.

4. Label the tube with the organism name, strain, date, etc.

5. Place the tube in the freezer and record its location.

6. To revive stored bacteria, only a small volume of culture needs
to be removed from the tube. The tube does not need to be
thawed. Open the tube and scrape the frozen culture with a
sterile pipette tip. Return the tube to the freezer immediately.
Transfer the small volume of frozen/thawed cells to an
agar plate and streak. If the culture thaws, do not re-freeze it,
as cells are typically very sensitive to freezing and thawing.
Discard the thawed culture appropriately.



11.5.2. Freeze-Drying (Lyophilization)

Freeze-drying (lyophilization) is a well-established method for
long-term storage. It removes water, which serves as the medium
not only for enzymatic reactions but also for spontaneous
damaging reactions such as free radical formation. Many
bacteria and spore-forming fungi can be preserved very effectively
by freeze-drying. By freezing the cells in a medium that contains a
lyoprotectant (usually sucrose) and then pulling the water out
using a vacuum (sublimation), cells can be effectively preserved.
This method is laborious and requires specialized equipment, but
it has the advantage of generating stock cultures that are
unaffected by power outages and empty liquid nitrogen tanks.
Furthermore, if cultures are routinely shipped to other labs,
freeze-dried cultures do not require special handling. The
downside of freeze-drying is that not all cultures react the same
way; thus some experimentation is required to optimize the process
for each strain.

There are four significant considerations for freeze-drying
microorganisms. Culturing and preparing the cells is the first
consideration. Generally this does not differ from the usual
methods for culturing bacteria. The second aspect involves
suspending the bacteria in a suitable freeze-drying medium;
commonly, skim milk or sucrose is used. The third consideration is
the freeze-drying process, which is extremely dependent upon the
type of freeze-dryer used and the quantity of samples to be
preserved. The final aspect deals with post-lyophilization
storage. This process can be used to preserve bacteria, fungi,
yeasts, proteins, nucleic acids, and any other molecules which may
be degraded due to the presence of water.



11.5.2.1. Selection of a Freeze-Drying Medium

Preserving bacteria by lyophilization requires that the bacteria
be suspended in a medium that helps to maintain their
viability through freezing, water removal, and subsequent storage.
Common ingredients for this include mannitol, skim milk, and
bovine serum albumin (BSA). A second component of a good
medium is a lyoprotectant (sucrose and trehalose), which helps to
preserve the structure of biomolecules throughout the
lyophilization process (see Note 2). Generally, skim milk (20 %)
and sucrose (5-10 %) are used as basic freeze-drying solutions.



11.5.2.2. Cultivation of the Bacteria

1. Freeze-drying is best performed on actively growing cells 
which are collected and suspended in freeze-drying medium. 
Cells are usually cultured in liquid medium and then collected 
by centrifugation. Alternatively, cells can be washed off a 
recently streaked agar plate. In either case, it is best to suspend
and freeze-dry at high cell densities (~10^/ml). Using higher 
cell concentrations ensures that the culture will retain at least 
some viable cells after prolonged storage. It is also possible to 
simply inoculate cultures into a freeze-drying medium and 
lyophilize at very low cell densities; however, the long-term 
survival of such preparations should be carefully scrutinized. 

2. Freeze-drying of bacteria should be done in glass vials or 
ampoules. Plastic should never be used as water can actually 
diffuse across many plastics over time. The type of glass vessel 
used may be dependent upon the configuration of the freeze- 
dryer. A basic freeze-drying apparatus may simply be a high 
efficiency vacuum pump connected to a cold moisture trap 
which in turn is attached directly to the sample. In such cases, 
long neck, heat sealable ampoules should be used. Once
ampoules are filled, sterile cotton or glass wool is inserted into
the neck of the ampoule to prevent contamination of the
sample. Dried bacteria are sealed under vacuum in the
ampoules with a propane or acetylene flame. Flame sealing is
labor intensive, but it is the most secure method of preserving
the samples. If the freeze-dryer has a drying chamber, such as
with shelf dryers, then samples can be lyophilized in serum vials
and sealed with rubber stoppers (called bungs). These stoppers
often have a notched, or split, base that is inserted in the
vial opening, which allows water to escape during drying. Shelf
freeze-dryers are often equipped with a stoppering mechanism
that pushes stoppers into the vials, effectively sealing
the vial while under vacuum.

3. Whether ampoules or vials are used, these should be filled to
not more than 1/3 volume. Smaller volumes are permissible
and speed the freeze-drying process; thus dispensing 250 µl of
cell suspension into a 3-ml vial creates a high surface area
relative to the volume, which allows faster freeze-drying.
With vials, bungs are inserted and the samples
are ready for processing.



11.5.2.3. Freeze-Drying Process

A basic freeze-drying process can be divided into three stages:
freezing, primary drying, and secondary drying.



11.5.2.3.1. Freezing

Microorganisms can be frozen by putting a prepared
ampoule into a freezer, a dry ice/ethanol bath, or liquid nitrogen.
Rapid freezing works well for preserving cell viability; however,
it makes the removal of water more difficult. When the frozen
culture is placed under vacuum, water sublimes from the ice;
the surface of the culture loses water first, followed
by water in the center of the sample. For water to sublime from
the interior of the sample, small pores or channels must form so it
can escape. Rapid freezing tends to create a solid block in which
channel formation is minimal. Consequently, rapidly frozen samples
require longer drying times. Samples can instead be frozen
more slowly by placing a rack of vials/ampoules in an ultralow
freezer and allowing the culture to cool gradually. Shelf dryers
often have programmable temperature control that can be used to
freeze cultures slowly as well. A slower rate of cooling results in
larger ice crystal formation in the sample, which essentially
creates the channels for water escape. Though different strains of
bacteria may behave differently, dropping the temperature of
prepared cells from ambient to −40 °C over 30-60 min will typically
be effective. However, before committing to freeze-drying large
numbers of samples, test the freezing step first.
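
Cooling from ambient to −40 °C over 30-60 min corresponds to a rate of roughly 1-2 °C per minute. The Python sketch below computes that rate and lists evenly spaced set points for a linear ramp, of the kind that might be entered into a programmable shelf controller. The ambient temperature of 20 °C, the 10-min step, and the function names are assumptions made for illustration.

```python
def cooling_rate(start_c, end_c, minutes):
    """Average cooling rate in degrees C per minute (positive when cooling)."""
    if minutes <= 0:
        raise ValueError("duration must be positive")
    return (start_c - end_c) / minutes


def ramp_setpoints(start_c, end_c, rate_c_per_min, step_min=10.0):
    """(time in min, temperature in C) pairs for a linear cooling ramp."""
    if rate_c_per_min <= 0 or step_min <= 0:
        raise ValueError("rate and step must be positive")
    if end_c > start_c:
        raise ValueError("end temperature must not exceed start temperature")
    points = []
    elapsed = 0.0
    temp = start_c
    while temp > end_c:
        points.append((elapsed, temp))
        elapsed += step_min
        temp = start_c - rate_c_per_min * elapsed
    # Finish exactly at the target temperature.
    points.append(((start_c - end_c) / rate_c_per_min, end_c))
    return points


# Assumed ambient temperature of 20 C, cooled to -40 C.
print(cooling_rate(20.0, -40.0, 30.0), cooling_rate(20.0, -40.0, 60.0))
for minute, temp in ramp_setpoints(20.0, -40.0, 1.0):
    print(f"{minute:5.1f} min  {temp:6.1f} C")
```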
11.5.2.3.2. Primary Drying

Once the bacterial samples are frozen, the vacuum can be applied.
Only high efficiency vacuums, i.e., pumps that can reduce the
pressure to under 200 mtorr, will freeze-dry samples effectively.
The key to primary drying is to raise the temperature of the sample
so it is higher than the temperature of the cold trap. In basic
systems, the cold trap is often a flask which is immersed in a dry
ice/ethanol bath. Ampoules connected to basic systems are initially
cold (−70 °C, which is the temperature of the dry ice bath) but
warm as they absorb ambient heat. The heat creates sufficient
molecular motion to allow water molecules to sublime, i.e., go
from solid ice to gas, as long as vacuum is present. With high
efficiency vacuums, the trick is to remove water faster than the
sample absorbs heat. The sublimation of the water thus keeps the
bacterial solution frozen. If the sample increases in temperature
too rapidly, the solution will melt, which negates the value of
freeze-drying.

Shelf freeze-dryers have a refrigerated condenser which serves
as a cold trap. The condenser is also used to control the
temperature of the shelf. For primary drying, the shelf
temperature is raised so that the water in the sample sublimes,
but melting does not occur. In this arrangement, a condenser may
hold a temperature of −50 °C while the shelf temperature is raised
to −10 °C. It is important that water moves from the warmer
location (sample) to the colder location (condenser) without the
sample melting. The use of matrix forming agents, such as bovine
serum albumin (BSA) or mannitol, is very useful for helping to
form a frozen sample that maintains its shape as water is removed.
Without the additives, the sample would collapse. Additionally,
the use of small volumes also benefits this primary drying stage,
in that water is more rapidly removed from small samples with large
surface areas, such as 0.5 ml in a 3-ml vial. Primary drying can
take anywhere from 3 to 4 h for a small sample to overnight for a
fully loaded shelf freeze-dryer.
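
Water moves from the sample to the condenser because the vapor pressure of ice falls steeply with temperature, so ice at the shelf temperature exerts a much higher vapor pressure than ice at the condenser temperature. The Python sketch below compares the two using a Magnus-type empirical approximation for the saturation vapor pressure over ice. The formula and its coefficients are an assumption brought in for illustration and do not come from this chapter; the function name is hypothetical.

```python
import math

HPA_TO_MTORR = 750.062  # 1 hPa is about 0.750062 torr


def ice_vapor_pressure_mtorr(temp_c):
    """Approximate saturation vapor pressure over ice, in mtorr.

    Uses a Magnus-type approximation (an assumed formula, valid only
    for temperatures below freezing).
    """
    if temp_c > 0:
        raise ValueError("approximation applies to ice, at or below 0 C")
    hpa = 6.112 * math.exp(22.46 * temp_c / (272.62 + temp_c))
    return hpa * HPA_TO_MTORR


shelf_c = -10.0      # example shelf temperature from the text
condenser_c = -50.0  # example condenser temperature from the text

p_shelf = ice_vapor_pressure_mtorr(shelf_c)
p_condenser = ice_vapor_pressure_mtorr(condenser_c)
print(f"ice at {shelf_c} C: about {p_shelf:.0f} mtorr")
print(f"ice at {condenser_c} C: about {p_condenser:.0f} mtorr")
print(f"ratio: about {p_shelf / p_condenser:.0f}x")
```

The larger this ratio, the stronger the driving force for sublimation, which is consistent with the advice that a greater temperature difference between shelf and condenser makes primary drying more efficient.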

11.5.2.3.3. Secondary Drying

Freeze-drying with a basic system does not allow separation
between primary and secondary drying. As the frozen water is
driven off from the sample, its temperature will rise to match that
of ambient. Secondary drying is employed to force out
residual water by increasing the temperature of the sample. In
shelf dryers, the sample temperature can be raised to 20 °C for
several hours prior to stoppering. It is important not to overdry
the bacteria, as this can be detrimental. The use of higher
temperatures is also not recommended for the same reason. Following
secondary drying, both vials and ampoules must be sealed. For shelf
dryers with a stoppering mechanism, press the stoppers into the
vials while under full vacuum. With ampoules, an acetylene or
propane torch is used to heat the long neck of the ampoule to
seal it. The vacuum will help to pull the glass closed.
11.5.2.3.4. Post-lyophilization Storage

Freeze-dried proteins can be stored at relatively warm
temperatures as long as no moisture gets to the sample. This is
not true for bacteria. Holding bacteria at temperatures above 4 °C
for prolonged periods of time will dramatically decrease the
viability of the cells. Bacteria which would otherwise be stable
for years if kept in a refrigerator can die within a week at room
temperature. Consequently, accelerated shelf life studies, in which
the sample is held at 37 °C to mimic long-term storage conditions,
will not work with lyophilized bacteria.

For long-term storage, keep vials and ampoules at 4 °C.
Periodically remove a vial/ampoule and assess the number of viable
cells remaining. Rates of decay should be measured, which will
help to determine when the sample needs to be resuscitated and
subsequently freeze-dried again. For the most part, freeze-dried
bacteria should be viable for several years.



11.5.2.4. Materials

1. Shelf freeze-drier, with T-matic shelf temperature control.

2. Preconstricted long-necked vials (2 ml) with butyl rubber
bungs (heat sterilize vials at 180 °C for 2-3 h; autoclave bungs
at 121 °C for 15 min), labeled with the strain
number of the organism to be freeze-dried, batch number,
and date of freeze-drying.

3. Pasteur pipettes.

4. Sterile nonabsorbent cotton plugs.

5. Air/gas glass blower's torch.

6. Glass cutter in support handle.

7. Sterile distilled water.

8. Lyophilization medium:

(a) 10 % skim milk (10 g dry skim milk + 100 ml deionized
water) (see Note 3).

(b) 10 % sucrose (10 g of sucrose in 100 ml deionized water)
(see Note 4).

(c) Lyophilization vials/tubes (see Note 5).



11.5.2.5. Method

11.5.2.5.5. Freeze-Drying Using Shelf Lyophilizer

1. Turn on the lyophilizer and start the condenser. If there is an
external condenser using a dry ice/ethanol mixture, then prepare
this as well. The shelf can be set to 4 °C.

2. Center the vials on the shelf. This placement is important so 
that the stoppering plate can evenly press on the stoppers 
following freeze-drying. 

3. Freeze the samples down to −40 °C either manually or
through programmed controls. This step should take
approximately 30-60 min and varies with the instrument.
If the rate of freezing can be controlled, then a drop of
1 °C/min is a practical rate. Once the samples reach temperature,
they should be visibly frozen (clear liquids turn opaque
and skim milk appears solid).

4. Allow the sample to sit at —40 °C for 1 h to ensure complete 
freezing. Vials at the center of a cluster may freeze more slowly 
than those on the outside. 

5. Turn on the vacuum pump. Within 10-20 min, the vacuum 
should be under 200 millitorr (mtorr). 

6. Once the vacuum is below 200 mtorr, increase the temperature
of the shelf for primary drying, the phase associated with water
sublimation. The temperature of the shelf is dependent upon
the lyophilization medium. For sucrose, keep the shelf
temperature at −25 °C. In any case, the greater the difference in
temperature between the shelf and the condenser/ice trap,
the more efficient the primary drying process will be.

7. If the samples melt, it may be necessary to determine a
suitable shelf temperature empirically. A practical way
to do this is to place a sample of the lyoprotective
medium on the shelf and lower the shelf temperature
incrementally every 15 min until the sample freezes.
At that temperature, under vacuum with a cold trap, your sample
will be safe and will remain frozen. This practical approach does
not necessarily give the most efficient primary drying
temperature, but it should work well enough.

8. Primary drying is the longest phase of the freeze-drying
process. The idea is to keep the sample frozen but warmer than the
condenser (or ice trap), so that water sublimes
rapidly. The temperature of the shelf can be raised to above
the melting temperature as long as the sublimation process
removes the heat flowing into the sample sufficiently fast to
prevent melting and sample collapse (where the matrix literally
caves in). The time for primary drying will also depend upon
the volume of the sample. For bacteria, a 0.25-0.5 ml sample is
required. A limited number of samples (10-20) in a shelf dryer
can be completed in just a couple of hours. A fully loaded
dryer with several hundred samples may take longer. To be safe, an
overnight primary drying period should work, but
test this first before you attempt to freeze-dry large numbers of
vials. As a standard guide, freeze-dry overnight.

9. Samples still contain moisture following primary drying. The
amount is debatable, but it is somewhere between 2 % and 4 %.
This moisture level needs to be reduced, which is done by
supplying heat to the sample during the secondary drying
phase. This phase is relatively short, lasting 1-2 h, but
important for long-term viability. However, overdrying of the
bacteria can be detrimental as well. Once again, based on the
idiosyncrasies of your lyophilizer and samples, the ideal time
for secondary drying needs to be determined experimentally.
Generally, raise the shelf temperature to 20 °C and dry for 2 h.

10. With the vacuum in place, stopper the vials using the stopper- 
ing plate/mechanism. Release the vacuum, remove the vials, 
and further secure the rubber bungs/stoppers with foil crimp 
seals. It is best to store the vials at 4 °C in the dark. 

11. Test the freeze-dried bacteria for viability as compared to the
original culture. Additionally, monitor the stability/viability
of the freeze-dried cultures by testing at periodic intervals. A
good protocol will yield nearly 100 % viable cells.
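
Viability relative to the original culture is commonly expressed as percent survival or as a log reduction in viable counts. The Python sketch below shows both calculations, assuming that viable cells are counted as colony-forming units (CFU) on dilution plates; the counting method, the function names, and all numbers are assumptions made for illustration, not part of this protocol.

```python
import math


def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Viable count per ml from a colony count on one dilution plate."""
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def percent_survival(cfu_before, cfu_after):
    """Viable cells after freeze-drying as a percentage of the original."""
    return 100.0 * cfu_after / cfu_before


def log_reduction(cfu_before, cfu_after):
    """Loss of viability expressed in log10 units."""
    return math.log10(cfu_before / cfu_after)


# Hypothetical plate counts: 0.1 ml plated from a 10^6-fold dilution.
before = cfu_per_ml(150, 1e6, 0.1)
after = cfu_per_ml(120, 1e6, 0.1)
print(f"survival: {percent_survival(before, after):.0f} %")
print(f"log reduction: {log_reduction(before, after):.2f}")
```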

1. Once bacteria have been dispensed into vials/tubes, freeze in
a −80 °C freezer or equivalent. Flash freezing can be done in a
dry ice/ethanol bath, but such samples tend to dry more slowly.
Keep the samples frozen using dry ice until they are
connected to the manifold.

2. Turn on the lyophilizer and condenser/cold trap. The mani- 
fold valves should be turned off and the vacuum turned on. 
Allow the vacuum to pull down to 200 mtorr or less. 

3. Quickly connect a vial/tube to the manifold and open the
valve. The vacuum will immediately start the sublimation
process and pull heat from the sample. In turn, hook up the
remaining samples. The pressure reading will rise each time a
valve is opened, but it should begin to fall immediately. If the
pressure does not drop after a tube is attached, there might be a
leak in that connection; shut that valve and proceed to the others.

4. Freeze drying with a manifold relies on ambient heat to drive 
the sublimation of the water. As the available water decreases, 
the temperature will gradually climb to ambient. This may 
take 2-3 h. Often frost that forms on the outside of the tubes 
will dissipate once the sample is done. 

5. Using an acetylene torch (propane will work but it takes 
longer), seal each vial or tube. Wear safety glasses to protect 
your eyes from shattering glass and gloves (such as cotton 
gardening gloves) to protect your hands from the hot glass. 

6. Sealed vials should be stored at 4 °C in the dark. 

7. Test the freeze-dried bacteria for viability as compared to the
original culture. Additionally, monitor the stability/viability
of the freeze-dried cultures by testing at periodic intervals
for up to 1 year. If viability starts to decline rapidly,
modification of the protocol is probably necessary.
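
Periodic viability tests can be turned into a rate of decay, which helps to judge when a stock should be revived and freeze-dried again. The Python sketch below assumes that viable counts decline log-linearly with storage time, which is a simplifying assumption rather than a finding of this chapter. The function names, the counts, and the threshold are hypothetical.

```python
import math


def fit_log_decay(days, cfu):
    """Fit log10(CFU) = log10_n0 - k * days by least squares.

    Returns (k, log10_n0), where k is the loss in log10 units per day.
    """
    y = [math.log10(c) for c in cfu]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(days, y))
    slope = sxy / sxx
    return -slope, mean_y - slope * mean_x


def days_until(threshold_cfu, k, log10_n0):
    """Storage time at which the fitted count falls to threshold_cfu."""
    if k <= 0:
        return math.inf  # no measurable decline
    return (log10_n0 - math.log10(threshold_cfu)) / k


# Hypothetical viable counts (CFU/ml) measured during storage at 4 C.
sample_days = [0, 120, 240, 360]
sample_cfu = [10 ** 9.0, 10 ** 8.8, 10 ** 8.6, 10 ** 8.4]

k, log10_n0 = fit_log_decay(sample_days, sample_cfu)
print(f"decay rate: {k:.5f} log10 units per day")
print(f"days until 1e6 CFU/ml: {days_until(1e6, k, log10_n0):.0f}")
```

A rate that increases between successive test dates would suggest that the log-linear assumption no longer holds and that the protocol or storage conditions need review.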
11.6 Cryopreservation



Cryopreservation refers to the storage of a living organism at
ultralow temperature (−196 °C) such that it can be revived and
restored to the same living state as before it was stored. It is a very
reliable method and is generally considered superior to other 
preservation methods. Bacteria preserved in liquid nitrogen nor- 
mally show high survival rates and good strain stability during 
long-term storage. In liquid nitrogen storage of microorganisms, 
polypropylene cryotubes, glass vials, glass capillaries, and polypro- 
pylene straws are generally used. 

Several factors can affect cell viability and stability during
cryopreservation. As cells freeze, they become dehydrated and an
osmotic imbalance arises from changes in the concentration of
salts and other metabolites. During the cooling process, cellular
membranes can also be ruptured by the formation of large ice
crystals. Successful preservation can be achieved by using
cryoprotective agents (e.g., dimethylsulfoxide and glycerol),
maintaining a controlled rate of cooling (about 1 °C/min down to
about −30 °C), and following an appropriate rewarming protocol
(rapid thawing in a 37 °C water bath, which takes about 1 min for
a glass ampoule and somewhat longer for a plastic vial). In
practice, a relatively slow cooling rate can be easily obtained by
keeping ampoules/vials in mechanical deep freezers for 1-2 h, or
in the neck of the liquid nitrogen storage unit for some minutes,
and then lowering the containers into it. It is, however, not good
practice to plunge cultures directly into liquid nitrogen, as the
liquid nitrogen may seep into any imperfectly closed or sealed
capillaries, ampoules, or vials containing the bacterial
suspensions. On removal from storage, nitrogen trapped inside an
ampoule changes almost instantly to the gaseous phase and can
cause an explosion. For safety reasons it is thus recommended that
cultures be stored in the gas phase of liquid nitrogen.

While preparing cells for cryopreservation, several factors, such
as optimal growth conditions, the physiological state of the cells
(preferably from the late logarithmic to early stationary phase of
growth), and high cell density, should be considered, as these can
affect cell viability after cryopreservation. After mixing, cell
suspensions should be kept for equilibration with the
cryoprotective agent. For harvesting, liquid cultures are
centrifuged. However, vigorous pipetting and high-speed
centrifugation should be avoided, and cells should be handled
gently. Viability assays should be performed on all cultures
before and after cryopreservation to assure long-term viability.
To assure purity, the identity of the preserved cultures should be
verified, and after freezing, cultures should be recharacterized
to assure their stability. Safety precautions should be observed
when removing an ampoule from liquid nitrogen. A face shield,
laboratory coat, and insulated gloves should be worn as protection
against liquid nitrogen splash and exploding ampoules. The level of
the liquid nitrogen in the containers should be checked,
preferably on a daily basis, and maintained at a constant level,
as any drop in liquid nitrogen level below a critical volume can
result in damage due to the warming of the samples.



11.6.1. Materials

• Screw-capped plastic cryovials (2 ml)

• Screw-cap glass ampoules (10 × 30 mm) of 2 ml capacity
(see Note 6)

• Liquid nitrogen storage tanks with canisters, racks, and canes

• Hungate tubes with septa

• Butyl rubber overflow tubes (5 mm) with Luer Lock adapters
at both ends and long syringe needles (10-15 cm in length)

• Sterile gas-tight hypodermic Luer Lock syringes

• Cryoprotective agent (glycerol and dimethylsulfoxide)



11.6.2. Methods

1. Harvest cells in the late logarithmic or early stationary phase,
preferably while they are still actively growing. Scrape cells from
the growth surface if they are anchorage dependent. Centrifuge
broth or anchorage-independent cultures to obtain a cell
pellet, if desired.

2. Prepare presterilized glycerol (10 %, w/v) or DMSO (5 %,
v/v) at the desired concentration in fresh growth medium.
When the agent is to be mixed with a suspension of cells, prepare
it at twice the desired final concentration.

3. Centrifuge the culture for 30 min at 4,000 × g in the screw-cap
bottles in which the cultures are grown and remove the supernatant
anaerobically under a stream of nitrogen gas using an
overflow butyl rubber tube (5 mm) with Luer Lock adapters
at both ends and fitted with long syringe needles. To obtain
sterile nitrogen gas, a sterile cotton-filled syringe is attached to a
conduit connected to the N2 gas (99.99 %) cylinder.

4. Resuspend the pellet carefully in ice cold sterile DMSO solu- 
tion (5 %, v/v). In case of halophilic strains or cells which do 
not form a pellet, a thick bacterial suspension (in growth 
medium) is mixed in the ratio 3:1 with ice cold sterile
DMSO (20 %, v/v). For extreme halophilic strains, optimum 
salt concentration should be maintained after mixing cell 
suspension with the DMSO. The cells are allowed to equili- 
brate with the cryoprotectant (15 min for DMSO, 30 min for 
glycerol) in an ice bath. 
5. Filling of ampoules and freezing: While equilibrating, an aliquot
of 1.0-1.5 ml of cell suspension is dispensed into each
plastic cryovial or glass ampoule. For anaerobes using a sterile 
gas-tight syringe (5-10 ml), the ampoules are evacuated for 
anaerobiosis to facilitate filling. About 1 ml of thick cell 
suspension (equilibrated with the DMSO) is withdrawn with 
a 1 ml sterile oxygen-free syringe (already flushed with nitro- 
gen gas) and injected into each ampoule. Immediately after 
the glass ampoules or cryotubes are clamped onto labeled 
aluminum canes, they are placed at -30 °C for about 1 h or
for a few minutes in the gas phase of liquid nitrogen. The canes
are then placed in canisters, racks, or drawers and frozen by 
direct immersion in liquid nitrogen or preferably in the gas 
phase of liquid nitrogen. 

6. Revival of cultures: The frozen ampoule is removed from 
liquid nitrogen. For partial thawing, these are immediately 
immersed in the mini water bath at 37 °C for a few seconds. 
After thawing, the outer surface of the ampoules is dried by 
wiping and plastic vials are wiped with alcohol-soaked gauze 
prior to opening. For aerobic bacteria, the screwcap glass vials 
can be opened and flame sterilized at the neck. The thawed 
contents of the ampoule/vial are immediately transferred to 
fresh growth medium to dilute the cryoprotectant, which 
otherwise is lethal at higher temperatures. For anaerobes, 
the septum of the glass vial is flame sterilized after applying a
drop of alcohol, and a small volume (~0.05 ml) of inoculum is
withdrawn with a 1-ml oxygen-free syringe and injected
into 5-10 ml liquid growth medium. The rest of the cell
suspension is immediately frozen again (a self-made wax
block rack, chilled to -30 °C, is used for transportation to
the liquid nitrogen container) in liquid nitrogen for later use. 
In this way, one vial can be used for several repeated retrievals 
or inoculations. The DMSO which is often toxic during 
growth is diluted 100-200 times in the culture medium to a 
noninhibitory concentration. The inoculated growth medium 
is incubated under appropriate growth conditions. 

7. Estimation of viability counts: For aerobic bacteria, 0.5 ml of 
inocula is transferred to 4.5 ml of liquid growth medium and 
serial decimal dilutions are prepared. Plating and counting are 
done using standard methods. For the estimation of viable cell 
counts in anaerobic bacteria, 0.5 ml of inocula is transferred 
from the unfrozen (for cell counts before freezing) and from 
the thawed cell suspension (for cell counts after freezing) into 
prereduced 4.5 ml medium in screw-cap tubes and 6-8 serial 
decimal dilutions are prepared using oxygen-free syringes and 
incubation is done under appropriate conditions. Colony 
counts on agar plates can be performed in an anaerobic 
glove box or anaerobic jars. In the case of viable colony counts
in agar roll tubes or on plates, the colonies are
counted for each dilution and the average colony-forming units
per sample are calculated. 
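
The dilution arithmetic behind steps 2, 4, 6, and 7 above can be sketched as follows. This is a minimal illustration rather than part of the protocol: the function names, the plated volume of 0.1 ml, and the colony counts are assumptions chosen for the example.

```python
def final_concentration(stock_percent, stock_volume, sample_volume):
    """Cryoprotectant concentration (%) after mixing a stock with a cell suspension."""
    return stock_percent * stock_volume / (stock_volume + sample_volume)


def dilution_fold(inoculum_ml, medium_ml):
    """How many times the inoculum (and its DMSO) is diluted in fresh medium."""
    return (inoculum_ml + medium_ml) / inoculum_ml


def cfu_per_ml(colonies, decimal_dilutions, plated_volume_ml):
    """Viable count of the undiluted suspension from one countable plate."""
    return colonies * 10 ** decimal_dilutions / plated_volume_ml


def percent_survival(cfu_before, cfu_after):
    """Viability after freezing relative to the unfrozen suspension."""
    return 100.0 * cfu_after / cfu_before


# Step 4: three parts suspension plus one part 20 % DMSO
print(final_concentration(20.0, 1.0, 3.0))
# Step 2: a stock at twice the final concentration, mixed 1:1
print(final_concentration(10.0, 1.0, 1.0))
# Step 6: 0.05 ml of inoculum into 5 ml or 10 ml of medium
print(dilution_fold(0.05, 5.0), dilution_fold(0.05, 10.0))
# Step 7: hypothetical colony counts (0.1 ml plated is an assumption)
before = cfu_per_ml(120, 6, 0.1)
after = cfu_per_ml(90, 6, 0.1)
print(percent_survival(before, after))
```

The two dilution folds in the example bracket the 100-200 times dilution of DMSO quoted in step 6.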



11.7 Safety 



Safety precautions must be considered when preserving microorganisms
by freeze-drying, freezing, and storing at cryogenic
temperatures.

1. Culture handling: When opening any microbial culture, fro- 
zen or freeze-dried cultures, take care to prevent dispersion of 
the ampoule contents. Open these preparations in a biological 
safety cabinet if possible and perform all work with hazardous 
cultures in a biological safety cabinet. There are varying 
degrees of pathogenicity among microorganisms. All labora- 
tory personnel should be aware of the hazards posed by the 
cultures they are handling. 

2. Cryogenic storage: Because of its extremely cold temperature,
liquid nitrogen can be hazardous if improperly used. When 
handling liquid nitrogen, take precautions to protect your 
eyes, face, and skin from exposure to the liquid. Wear protec- 
tive clothing, including a laboratory coat, gloves designed for 
handling material at cryogenic temperatures, and a face shield. 
To reduce your exposure to cryogenic temperatures, design 
inventory systems for storing frozen specimens that allow for 
easy retrieval and that minimize the time required to look for 
specimens. Prolonged exposure to cryogenic temperatures 
can lead to a loss of sensation in the hands that can only be 
recovered after warming. This loss of sensation can lead to a 
false sense of security regarding damage to tissues by the low 
temperatures. When the temperature in a liquid nitrogen unit 
becomes tolerable and working in the unit is no longer 
uncomfortable, the operator has reached a point where dam- 
age from the cryogenic temperatures is likely. When liquid 
nitrogen is used in confined and inadequately ventilated areas, 
the nitrogen can quickly displace the room air. Liquid nitrogen
freezers should be located in well-ventilated areas, and
special precautions should be taken during fill operations. In 
facilities with several liquid nitrogen freezers, an oxygen mon- 
itor should be installed to warn occupants of any deterioration 
in the air quality due to the nitrogen gas. Plastic screw-capped 
vials can present a hazard if stored directly in liquid nitrogen. 
Vials with an inadequate seal between the cap and the vial can 
fill with liquid nitrogen. Upon retrieval to warmer 
temperatures the vials may explode violently or may spray
liquid, potentially disseminating the contents of the vial. Like- 
wise when opening plastic vials after thawing some dissemina- 
tion of the contents may occur. Therefore, material in plastic 
ampoules should be stored in the vapor above the liquid 
nitrogen. 

3. Freeze-drying: When freeze-drying microorganisms in vials 
or ampoules without cotton plugs or other bacteriological 
filters, the microorganisms can be carried from the container 
and contaminate the freeze-drying system. Microbial contam- 
ination can be found on the outside of the vial or ampoule and 
on parts of the freeze-drying system such as the condenser. 
A system should be designed to monitor the contamination 
level, and decontamination procedures should be implemented
if necessary. Take care to properly treat freeze-dried cultures
prior to disposal. To autoclave freeze-dried cultures,
open the vial or ampoule to allow penetration of the steam. 
An alternative to autoclaving is to heat the preparations in a 
hot air oven at 180 °C for 4 h. 



11.8 Notes



1. Screw-cap tubes whose caps have O-rings are usually
preferred. One option is to fill the tube 1/3 full with
glass beads (4 mm) before sterilizing. When retrieving bacte- 
ria, remove or chip out one bead onto an agar plate and roll 
around to disperse the bacteria. This avoids thawing the entire 
stock. 

2. It is vital to keep accurate records of lyophilized bacteria as 
well as using dependable techniques for labeling individual 
samples. It is recommended to keep a hard-copy freeze-drying
log in which each batch of processed samples is recorded.
The labeling of tubes/vials is also critical and should
be done using a durable label. A clever labeling technique is
to place a small, sterile paper label inside the sample
tube, which is subsequently sealed in the tube along with the 
sample. 

3. Bacteria need a lyoprotectant which helps them survive the 
freeze-drying process. This medium can be very simple, such 
as 10 % skim milk, or complicated such as those that use 
animal sera. Good media have two main components: the 
lyoprotectant that stabilizes the cells when water is removed, 
and matrix agent that allows the entire sample to retain its 
shape during and after processing. Disaccharides such as 
sucrose and trehalose are excellent lyoprotectants. Matrix 
forming additives, often referred to as excipients, include
mannitol, BSA, serum, and skim milk. 

4. Sucrose is a lyoprotectant and provides good viability, espe- 
cially when compared to skim milk. However, bacteria pre- 
served in sucrose must be kept cold during lyophilization to 
prevent melting and collapse of the sample. 

5. The choice of vial or tube is very important for long-term 
survival of freeze-dried bacteria. Generally, vials/tubes used 
for lyophilizing bacteria are made of glass. Avoid usage of 
microfuge tubes for long-term storage as water can pass 
through plastic. Tubes, including ampoules, are the best con- 
tainers for long-term storage of freeze-dried bacteria. These 
are most commonly attached to a manifold that holds multi- 
ple tubes. Tubes and ampoules are frozen using a freezer or 
dry ice bath and then quickly connected to the manifold 
before they melt. After the samples are dried, the neck of the 
tube or ampoule is sealed off using a propane or acetylene 
torch. Tubes and ampoules sealed under vacuum are impervi- 
ous to moisture. The downside is that they require much 
more labor in comparison to vials. For freeze-drying in a 
shelf lyophilizer, vials are equipped with a stoppering plate. 
Vials are filled and then fitted with a stopper which has a notch 
that allows gas flow while sitting loosely in the vial opening. 
After freeze-drying, the stoppering plate is lowered and 
pushes the stoppers into the vial under vacuum. When the 
vacuum is released, atmospheric pressure secures the stopper 
and the vacuum within the vial. After removing from the 
lyophilizer, stoppers are further secured with an aluminum 
band which is crimped in place. Vials are very convenient and 
easy to use, but they can leak during long-term storage. For
short-term storage, they are very good. Any borosilicate
glass test tube or tubing can also be used for freeze-drying.
Borosilicate glass is more difficult to seal than soda-lime tubes, 
but it is more durable. In using tubes, the culture medium is 
added to a sterile tube which is then loosely plugged with 
sterile glass wool. The sample is frozen, hooked to the vac- 
uum, and processed. Once dry, a torch is used to seal the tube 
between the sample and vacuum manifold. 

6. Ampoules are the easiest container to seal with a flame due to 
their design. As above, the cell suspension is added to a sterile 
ampoule, usually with a Pasteur pipette or very narrow micro- 
pipette tip. The ampoule is then loosely plugged with sterile 
glass wool. Following processing, the ampoule is sealed using 
a flame. As the ampoule has a very thin neck, it is much easier 
to seal. Ampoules can also be purchased prescored, which
makes cracking them open much easier than with tubes.

Chapter 12



Microbes from Extreme Environment: Molecular
Identification Procedures

Surajit Das 

Abstract 

Extremophiles are organisms that inhabit extreme habitats such as those with very high or very low
temperature, high pressure, extremes of pH or salinity, or adverse nutrient conditions. To
survive in such conditions, these microorganisms develop unique mechanisms, such as the secretion of
particular enzymes or proteins, that make them of special interest for biotechnological applications. Using these
microorganisms to produce useful substances at industrial scale requires proper identification and
characterization. Although conventional techniques are available for this purpose,
they are of limited scope and require proper validation and modification. The molecular approach has
broadened this limited range, and DNA-based methods are emerging as more reliable, simple, and
inexpensive ways to identify and classify microorganisms. Hence, this chapter describes some techniques for
identifying these extreme organisms at the molecular level.



12.1 Introduction 



Extreme environments include those with high temperature, pH, pressure,
or salt concentration; those with low temperature, pH, nutrient
concentration, or water availability; and those with high levels
of radiation, harmful heavy metals, or toxic compounds (organic
solvents). Extremophiles are the microorganisms that inhabit
these environments. The discovery of Thermus aquaticus in
Yellowstone National Park in 1965 by Professor Thomas Brock
paved the way for thermophilic microbiology, and in 1968 Professor
K. Horikoshi coined the term alkaliphiles to describe
microorganisms that thrive at alkaline pH. Thermophiles and
alkaliphiles are just two examples of extremophiles, microorganisms
capable of living under extreme conditions. Other groups
include acidophilic, barophilic, psychrophilic, and halophilic
microorganisms.

These extremophiles are adapted to living at 100 °C in volca- 
nic springs, at low temperatures in the cold polar seas, at 
high pressure in the deep sea, at very low and high pH values 
(pH 0-1 or pH 10-11), or at very high salt concentrations
(~35 %). The majority of the organisms that grow in these extreme
environments belong to a group with distinct characteristics [1].
Carl Woese named this group archaea and postulated the archaea 
as the third domain of life on earth, different from bacteria and 
eukarya. In many cases microbial biocatalysts, especially of extre- 
mophiles, are superior to traditional catalysts, because they allow 
the performance of industrial processes even under harsh condi- 
tion, under which conventional proteins are completely dena- 
tured. By virtue of their positive properties, stability, specificity, 
selectivity, and efficiency, enzymes already occupy a prominent 
position in modern biotechnology. 

For many processes in the chemical and pharmaceutical indus- 
tries, suitable microbial enzymes can be found that have the poten- 
tial to optimize or even replace chemical processes. By using robust 
enzymes in biotechnical processes one is often able to better utilize 
raw materials, minimize pollutant emissions, and reduce energy 
consumption while simultaneously improving quality and purity of 
products, e.g., optically pure compounds. The additional benefits 
in performing industrial processes at high temperature include 
reduced risk of contamination, improved transfer rates, lower vis- 
cosity, and higher solubility of substrates. Extremophiles and their 
cell components, therefore, are expected to play an important role 
in the chemical, food, pharmaceutical, paper, and textile industries 
as well as environmental biotechnology. 

Therefore, proper identification of extremophilic strains
is very important. Conventional identification methods are
available for only a limited range of archaeal species. However,
the molecular approach to identification has broadened this
range. DNA-based methods are emerging as more reliable,
simple, and inexpensive ways to identify and classify
microorganisms [2]. Extremophilic bacteria can be identified by
fluorescence in situ hybridization (FISH) with rRNA-targeted probes,
polymerase chain reaction-denaturing gradient gel electrophoresis
(PCR-DGGE) [3], terminal restriction fragment length
polymorphism (T-RFLP) [4], 16S rRNA library construction, and
16S rDNA sequencing [5]. Two methods can be used for fungal
ecology studies: T-RFLP analysis and PCR-DGGE. To identify
actinomycete strains, follow the 16S rRNA library construction
and 16S rDNA sequencing protocol and/or culture the colonies
on solid medium, extract the genomic DNA, amplify the 16S rDNA
by PCR, sequence the 16S rRNA gene, compare the sequence with
public databases (e.g., NCBI or RDP), calculate the percentage
of similarity, and construct the phylogenetic tree [6].



12.2 Materials



12.2.1. Fluorescence In Situ Hybridization

12.2.1.1. For Fixation and Preparation of Sediment Samples

1. Microcentrifuge for 2-ml tubes

2. Vacuum pump

3. Ultrasonication probe

4. 2-ml screw-top microfuge tubes

5. 2-ml microfuge tubes

6. Plastic petri dishes (diameter, 5 cm)

7. White polycarbonate membrane filters (diameter, 25 mm;
pore size, 0.2 µm)

8. Cellulose nitrate support filters (diameter, 25 mm; pore size,
>0.45 µm)

9. Filter tower for 25-mm membrane filters

10. 1x PBS

11. Ethanol

12. 4 % (w/v) formaldehyde solution

12.2.1.2. Hybridization on Filter Sections and Counterstaining

1. 2-ml microfuge tubes

2. 0.5-ml microfuge tubes

3. 50-ml polyethylene tubes and rack

4. Blotting paper

5. Razor blades

6. Plastic petri dishes

7. Microscope slides + cover slips

8. CITIFLUOR mountant

9. VECTASHIELD mountant

10. 1 M Tris/HCl, pH 7.4

11. Formamide

12. 0.5 M EDTA, pH 8

13. 10 % (w/v) sodium dodecyl sulfate (SDS)

14. 5 M NaCl solution

15. 4',6-Diamidino-2-phenylindole (DAPI) dissolved in distilled
H2O, final concentration, 1 µg/ml

16. 80 % (v/v) ethanol
16. 80 % (v/v) ethanol 
12.2.1.3. Hybridization with Horseradish Peroxidase (HRP)-Labeled
Probes and Tyramide Signal Amplification (TSA)

1. 2-ml microfuge tubes

2. 0.5-ml microfuge tubes

3. 50-ml polyethylene tubes and rack

4. Blotting paper

5. Razor blades

6. Plastic petri dishes

7. Microscope slides + cover slips

8. Parafilm

9. DAPI (4',6-diamino-2-phenylindole dihydrochloride)

10. 1 M Tris-HCl, pH 7.4

11. Formamide

12. 0.5 M EDTA, pH 8

13. 10 % (w/v) sodium dodecyl sulfate (SDS)

14. 5 M NaCl solution

12.2.2. PCR-Denaturing Gradient Gel Electrophoresis

1. Formamide (deionized): Add 10 g of mixed bed resin (Sigma)
to 100 ml formamide in an Erlenmeyer flask and stir for 30-60 min.
Decant or filter the formamide to separate it from the resin
beads. Store the deionized formamide in volumes of 35 ml at
-20 °C for the preparation of the gel solution.

2. Acrylamide/bis-acrylamide stock solution (37.5:1; 40 % w/v):
This can be purchased or prepared from acrylamide
powder (filter the solution and store at 4 °C in a dark bottle).

3. DGGE acrylamide/bis-acrylamide solutions (for a gradient of 
30-65 % of denaturant) 

(a) 10 % acrylamide/0 % denaturant

• 2.5 ml 50x TAE

• 62.5 ml acrylamide/bis-acrylamide 40 %

• 185 ml distilled H2O

(b) 8 % acrylamide/0 % denaturant

• 2.5 ml 50x TAE

• 50 ml acrylamide/bis-acrylamide 40 %

• 197.5 ml distilled H2O

(c) 10 % acrylamide/100 % denaturant

• 2.5 ml 50x TAE

• 62.5 ml acrylamide/bis-acrylamide 40 %

• 100 ml formamide

• 105 g urea + H2O to 250 ml

(d) 8 % acrylamide/100 % denaturant

• 2.5 ml 50x TAE

• 50 ml acrylamide/bis-acrylamide 40 %

• 100 ml formamide

• 105 g urea + H2O to 250 ml

Note: These solutions have to be filtered through a 0.45-µm
filter and stored at 4 °C.



4. Mixed gel solutions for a DGGE gel

30 % denaturing solution: 7.2 ml of 8 % acrylamide/100 %
denaturant solution + 16.8 ml of 8 % acrylamide/0 % denaturant
solution [then add 32 µl TEMED (N,N,N',N'-tetramethylethylenediamine)
and 75 µl 10 % APS (ammonium persulfate)].

65 % denaturing solution: 15.6 ml of 10 % acrylamide/100 %
denaturant solution + 8.4 ml of 10 % acrylamide/0 % denaturant
solution (then add 32 µl TEMED and 75 µl 10 % APS).

Stacking gel solution: 8 % acrylamide/0 % denaturant
solution (10 ml per gel); add 20 µl TEMED and 40 µl 10 % APS.

Note: These solutions have to be degassed for 15 min under
vacuum and stored at 4 °C in the dark. TEMED and 10 % APS
are added before loading the gel.

5. 10x gel loading solution: bromophenol blue (0.025 g;
0.25 % w/v); xylene cyanole (0.025 g; 0.25 % w/v); glycerol
(5 ml; 100 % v/v); water 5 ml 
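
The quantities in the stock and mixed gel solutions follow from simple proportions, which the sketch below reproduces. It is only an illustration: the function names are hypothetical, the 7 M urea figure is the one quoted for 100 % denaturant in the DGGE conditions of this chapter, and the molar mass of urea (60.06 g/mol) is a standard value that the text does not state.

```python
UREA_G_PER_MOL = 60.06  # standard molar mass of urea; not given in the protocol


def stock_volume_ml(stock_percent, final_percent, final_volume_ml):
    """Volume of stock needed for the final concentration (C1 * V1 = C2 * V2)."""
    return final_percent * final_volume_ml / stock_percent


def urea_grams(molarity, final_volume_ml):
    """Mass of urea needed for a given molarity and final volume."""
    return molarity * (final_volume_ml / 1000.0) * UREA_G_PER_MOL


def denaturant_mix_ml(target_percent, total_ml):
    """Volumes of the 100 % and 0 % denaturant solutions giving target_percent."""
    high = total_ml * target_percent / 100.0
    return high, total_ml - high


print(stock_volume_ml(40.0, 10.0, 250.0))  # 40 % acrylamide stock for the 10 % solutions
print(stock_volume_ml(40.0, 8.0, 250.0))   # 40 % acrylamide stock for the 8 % solutions
print(round(urea_grams(7.0, 250.0), 1))    # urea for 250 ml of 100 % denaturant
print(denaturant_mix_ml(30.0, 24.0))       # 30 % denaturing solution, 24 ml in total
print(denaturant_mix_ml(65.0, 24.0))       # 65 % denaturing solution, 24 ml in total
```

The same helper can be reused when a different gradient range or gel volume is needed.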



12.2.3. Terminal Restriction Fragment Length Polymorphism

1. Extraction buffer (100 mM Tris, 100 mM EDTA, 100 mM
sodium phosphate buffer, pH 8.0)

2. Proteinase K (10 mg/ml)

3. Lysozyme (100 mg/ml)

4. Sodium dodecyl sulfate (10 %)

5. 5 M NaCl

6. 5 % CTAB (hexadecyltrimethylammonium bromide)

7. Isopropanol

8. 10 M ammonium acetate (pH 7.5)

9. PCR reagents

10. Deionized formamide

11. ROX-labeled GS500 internal size standard (Applied
Biosystems)

12. Loading buffer

13. 5 % polyacrylamide gel

12.2.4. 16S rRNA Library Construction and 16S rDNA Sequencing

1. 0.12 M sodium phosphate buffer (PB buffer, pH 8.0)

2. Lysis solution I (0.15 M NaCl, 0.1 M EDTA, pH 8.0,
10 mg/ml lysozyme)

3. Lysis solution II (0.1 M NaCl, 0.5 M Tris-HCl, pH 8.0,
12 % SDS)

4. 5 M NaCl

5. 10 % Triton X-100 in 0.7 M NaCl

6. CHCl3

7. Isoamyl alcohol

8. 13 % PEG (polyethylene glycol dissolved in 1.6 M NaCl)

9. 70 % ethanol

10. Deionized H2O



12.3 Methods

12.3.1. Fluorescence In Situ Hybridization

12.3.1.1. Fixation and Preparation of Sediment Samples




1. Suspend 0.5 ml of freshly collected sediment in 1.5 ml of 4 %
formaldehyde solution in a 2-ml screw-top microfuge tube.

2. Fix for 1-24 h.

3. Centrifuge at 10,000 rpm for 5 min and pour off the supernatant.

4. Add 1.5 ml of PBS and resuspend the sample.

5. Centrifuge at 13,000 × g for 5 min and pour off the supernatant.

6. Add 1.5 ml of a 1:1 mix of PBS/ethanol and store the sample at
-20 °C until further processing.

7. Resuspend the sample and transfer a 20-100 µl aliquot to 500 µl
of a 1:1 mix of PBS/ethanol in a 2-ml microfuge tube.

8. Sonicate the aliquot for 20-30 s at low intensity using 1-s
sonication pulses.

9. Place cellulose nitrate support filters beneath the membrane
filters to improve the distribution of cells.

10. Add a 15-20 µl aliquot of the sonicated sample to 2 ml of
distilled water and filter this volume onto the membrane filters.

11. Air-dry the filtered preparations and store in petri dishes at
-20 °C until hybridization.
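
Steps 3 and 5 give one spin in rpm and the other as a relative centrifugal force. The two can be interconverted with the standard relation RCF = 1.118 × 10^-5 × r × rpm^2 (r in cm), as sketched below. The rotor radius of 7 cm is purely an assumption for illustration; use the value for the rotor at hand.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rpm_to_rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) reached at a given rotor speed."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rcf_to_rpm(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


ASSUMED_RADIUS_CM = 7.0  # hypothetical microcentrifuge rotor radius

print(round(rpm_to_rcf(10_000, ASSUMED_RADIUS_CM)))  # x g reached at 10,000 rpm
print(round(rcf_to_rpm(13_000, ASSUMED_RADIUS_CM)))  # rpm needed for 13,000 x g
```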
Table 12.1

Composition of hybridization buffer

Formamide (%)   Formamide (µl)   5 M NaCl + 1 M Tris-HCl + 10 % SDS   H2O (µl)



12.3.1.2. FISH Probe-EUB338

1. Specificity: Most bacteria and archaebacteria

2. Target molecule: 16S rRNA

3. Position: 338-355; temperature: 55 °C; formamide: 0-50 %

4. Sequence: 5'-GCT GCC TCC CGT AGG AGT-3'

5. Labeling: 5' labeling with fluorescein isothiocyanate (FITC)

6. Prepare the probe solution: 2 µl probe + 18 µl hybridization
buffer per section

12.3.1.3. Preparation of Hybridization Buffer and Washing Buffer

1. Prepare the hybridization buffer with 0-50 % formamide in
2-ml Eppendorf tubes following the instructions (Table 12.1)
and heat at 48 °C for 10 min.

2. Prepare the washing buffer for 0-50 % formamide in 2-ml
Eppendorf tubes following the instructions (Table 12.2) and
heat at 48 °C for 10 min.

12.3.1.4. FISH onto Polycarbonate Filters

1. Filter the sample, cut the filter into sections with a razor blade,
and number each section with a pencil outside the filtered zone.

2. Cover a microscope slide with Parafilm and then put 10 µl of
the probe solution onto the slide.
Table 12.2

Composition of washing buffer

Formamide (%)   5 M NaCl + 1 M Tris-HCl + 10 % SDS   H2O (µl)
0               360 µl + 40 µl + 2 µl                1,598
5               252 µl + 40 µl + 2 µl                1,498
10              180 µl + 40 µl + 2 µl                1,398
15              127.2 µl + 40 µl + 2 µl              1,298
20              86 µl + 40 µl + 2 µl                 1,198
25              59.6 µl + 40 µl + 2 µl               1,098
30              40.8 µl + 40 µl + 2 µl               998
35              28 µl + 40 µl + 2 µl                 898
40              18.4 µl + 40 µl + 2 µl               798
45              12 µl + 40 µl + 2 µl                 698
50              7.2 µl + 40 µl + 2 µl                598



3. Place the filter section onto the drop and then add 10 µl of
probe solution onto the filter section.

4. Make a dark humid chamber from a 50-ml Falcon tube with a
folded piece of tissue inside and pour the rest of the
hybridization buffer onto the tissue.

5. Transfer the hybridization chamber immediately to the
hybridization oven at the hybridization temperature for 1.5-3 h.

6. After that, wash the filter in the washing buffer for 10 min
at 48 °C.

7. Remove the excess washing buffer with distilled water in a
petri dish for 1-2 min in the dark.

8. Finally, dry the filter section with filter paper and mount with a
coverslip. 
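
A quick consistency check on the probe listed in Section 12.3.1.2 can be scripted as below. The sketch derives the rRNA target site as the reverse complement of the probe and confirms that the probe length matches the stated position 338-355. The helper names are hypothetical, and nothing here replaces a specificity check against a curated probe database.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


probe = "GCT GCC TCC CGT AGG AGT".replace(" ", "")  # sequence as listed, 5'->3'
target_dna = reverse_complement(probe)
target_rna = target_dna.replace("T", "U")  # the probe binds rRNA, so show U

print(len(probe) == 355 - 338 + 1)  # does the length match the stated position?
print(target_rna)
print(round(gc_percent(probe), 1))
```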



12.3.1.5. Counterstaining with DAPI



1. Counterstain the filter section with DAPI (4',6-diamino-2-
phenylindole dihydrochloride).

2. Place one drop of 1.5 µg/ml DAPI onto the filter section and
then place the cover slip again onto the filter section.

3. Observe the sample directly under an epifluorescence microscope.
12.3.2. PCR-Denaturing Gradient Gel Electrophoresis

12.3.2.1. Extraction of DNA from Environmental Samples and PCR



1. Extract the genomic DNA following the methods described
elsewhere in this manual.

2. Perform PCR amplification of the 16S rRNA fragment prior
to DGGE using the following DGGE-specific primers: F
5'-CGC CCG CCG CGC GCG GCG GGC GGG GCG
GGG GCA CGG GGG GCC TAG GGG AGG CAG CAG-3'
and R 5'-ACC GCG GCT GCT GG-3' [7].

3. Prepare the PCR reaction mixture in a volume of 50 µl
containing 1.0 U of polymerase, each primer at a concentration
of 0.5 µM, and 50 ng of DNA template.

4. The amplification conditions and PCR program are as follows:

(a) 94 °C for 12 min

(b) 94 °C for 1 min

(c) 50 °C for 1 min

(d) 72 °C for 2 min

(e) Go to step (b), 24 times

(f) 72 °C for 7.5 min

(g) 15 °C, hold

(h) End

5. Check the PCR amplifications on 1 % (w/v) agarose gel 
before DGGE analysis. 
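
The program in step 4 can be written down as data, which makes the cycle count and the minimum run time explicit. The sketch assumes that the "go to ... 24 times" instruction means 24 repeats after the first pass, i.e., 25 cycles in total; ramping time between temperatures is ignored, so a real run takes longer. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    minutes: float


INITIAL_DENATURATION = Step(94, 12)
CYCLE = (Step(94, 1), Step(50, 1), Step(72, 2))
REPEATS_AFTER_FIRST_PASS = 24  # assumed reading of the "go to" instruction
FINAL_EXTENSION = Step(72, 7.5)


def total_cycles(repeats):
    """Number of cycles when the block is run once and then repeated."""
    return repeats + 1


def hold_time_minutes(initial, cycle, repeats, final):
    """Sum of programmed hold times, excluding ramping and the final hold."""
    per_cycle = sum(step.minutes for step in cycle)
    return initial.minutes + total_cycles(repeats) * per_cycle + final.minutes


print(total_cycles(REPEATS_AFTER_FIRST_PASS))
print(hold_time_minutes(INITIAL_DENATURATION, CYCLE,
                        REPEATS_AFTER_FIRST_PASS, FINAL_EXTENSION))
```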



12.3.2.2. DGGE Conditions 



1. Perform DGGE by loading the purified PCR product onto
an 8 % (w/v) polyacrylamide (ratio of acrylamide to
bisacrylamide, 37.5:1) gel with a gradient of 30-65 % denaturant
(100 % denaturant contains 7 M urea and 40 % (v/v)
formamide).

2. Electrophorese the gels in 1x TAE buffer at 30 V for 15 min
and then at 130 V for 4.5 h at 60 °C.

3. Stain the gels in purified water (MilliQ or Millipore) containing
ethidium bromide (0.5 mg/L) and destain twice in
0.5x TAE for 15 min each.

4. Capture the images using an image analyzing system. 



12.3.2.3. Extraction 
of DNA from Acrylamide 
Gels 



1. Excise the central 1 mm² portion of strong DGGE bands
using a razor blade and soak in 50 µl of purified water (MilliQ;
Millipore) overnight at 4 °C.

2. Use 1 µl of supernatant as template in a PCR-DGGE reaction,
preparing the following reaction mix:

(a) 1 U of Advantage 2 polymerase mix

(b) 2 µl 10x buffer

(c) 2 µl MgCl2 (2 mM)

(d) 1.5 µl dNTP (1.5 µM each)

(e) 10 pmol of each primer (341F-5'-CGC CCG CCG CGC
GCG GCG GGC GG GGC GGG GGC CGG GGG GC
CTA CGG GAG GCA GCA G-3' and 534R-5'-ATT ACC
GCG GCT GCT GG-3')

(f) 0.5 µl formamide

(g) 1 µl template DNA

(h) Double-distilled water 25 µl

3. Set the program as follows:

(a) 34 cycles of 20 s at 96 °C, 45 s at 60 °C, and 45 s at 72 °C

(b) 1 cycle of 5 min at 72 °C

4. Check the products on a 1 % agarose/1x TBE gel and
then purify them using a PCR clean-up centrifugal device.



12.3.2.4. Sequencing of Purified PCR Products and Cloning



1. Sequence the PCR products derived from strong DGGE
bands that are free of minor bands directly, using the purified PCR
product.

2. Perform the sequencing reaction. 

3. Clone the amplification products into either pCR II or pCR4-
TOPO plasmid vectors.

4. Transform the ligated products into competent
Escherichia coli.

5. Extract the DNA from the recombinant (white) colonies and
use the aqueous DNA solution as DNA template in the PCR
reaction.

6. Analyze the products by gel electrophoresis and digest with
the restriction enzyme EcoRI to check for the presence of different
cloned bands.

7. Select the high frequency sequences in clone libraries for 
sequence analysis and phylogenetic tree construction. 
12.3.2.5. Phylogenetic Analyses



1. Analyze the percent similarity of DNA sequences through 
BLAST. 

2. Perform the multiple sequence alignment using Clustal X 
software. 

3. Construct the phylogenetic and molecular evolutionary trees 
to get the species affiliation using maximum parsimony, 
neighbor-joining, or maximum-likelihood analyses with
bootstrap values. 
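
As a rough illustration of what a percent similarity value means, the sketch below computes percent identity for two sequences that have already been aligned (for example, two rows taken from a Clustal X alignment). This is a simplification: BLAST reports identity over local alignments with its own scoring, and the toy sequences are invented for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical positions over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


seq_1 = "ACGT-ACGT"  # hypothetical aligned fragments
seq_2 = "ACGTTACGA"
print(percent_identity(seq_1, seq_2))
```

Whether gapped columns are excluded, as here, or counted as mismatches changes the value, so the convention should be stated whenever similarities are reported.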



12.3.3. Terminal 
Restriction Fragment 
Length Polymorphism 

12.3.3.1. Isolation 
of Community DNA 



1. Resuspend the soil sample (5 g) in 12 ml of extraction buffer. 

2. Add 100 µl of proteinase K (10 mg/ml) and 180 µl of lysozyme
(100 mg/ml).

3. Shake the sample at 37 °C for 30 min, then add
3 ml of 10 % SDS, 4.5 ml of 5 M NaCl, and 1.5 ml of 5 %
CTAB (hexadecyltrimethylammonium bromide)/1.5 M NaCl
and incubate further at 65 °C for 15 min.

4. Freeze the samples in liquid nitrogen and subsequently thaw 
at 65 °C for a further 15 min. Repeat this freeze-thaw cycle 
twice. 

5. Centrifuge at 4,200 rpm for 10 min at 48 °C. 

6. Transfer the resulting supernatant in a fresh sterile tube. 

7. Resuspend the remaining pellet by vortexing for 10 s in 4 ml 
of extraction buffer, 1 ml of 10 % SDS, and then incubate at 
65 °C for 10 min. 

8. Centrifuge at 4,200 rpm for 10 min at 48 °C. 

9. Remove the supernatant and combine with that from the first 
centrifugation. 

10. Add an equal volume of phenol:chloroform:isoamyl alcohol 
(25:24:1) to the combined supernatant and mix by inversion. 

11. Centrifuge the sample at 4,200 rpm for 10 min at 4 °C 
and transfer the aqueous upper layer to a fresh sterile tube. 

12. Add an equal volume of chloroform:isoamyl alcohol (24:1), 
mix by inversion, and centrifuge at 4,200 rpm for 10 min. 

13. Transfer the aqueous phase to a fresh tube and add 0.7 vol. of 
isopropanol and 0.3 vol. of 10 M ammonium acetate (pH 7.5) at room 
temperature. 

14. Centrifuge at 12,000 rpm for 30 min. 

15. Discard the supernatant and resuspend the pellet in 1.5 ml of 
70 % ethanol in an Eppendorf tube. 

16. Centrifuge at 14,000 rpm for a further 15 min. 

17. Discard the supernatant and air-dry the pellet before 
resuspending it in 250 µl of sterile distilled water. 

12.3.3.2. PCR Amplification 

12.3.3.3. T-RFLP Analysis 

12.3.3.4. Species Richness, Community Signature, and 
Community Similarity 

1. Perform PCR using 2.5 U of Taq DNA polymerase with the 
archaeal primers 25F (5' CYG GTT GAT CCT GCC RG 3') 
and 1492R (5' GGT TAC CTT GTT ACG ACT T 3') [8]. 

2. Use the following PCR program: 

(a) 96 °C for 5 min 

(b) 95 °C for 15 s 

(c) 49 °C for 30 s 

(d) 72 °C for 1 min 

(e) Go to step (b), 24 times 

(f) 72 °C for 5 min 

(g) Hold at 15 °C 

(h) End 

3. Visualize the PCR product after electrophoresis on a 0.8 % agarose 
TAE gel. 

4. Purify the PCR product with a Montage PCR centrifugal 
filter device (Millipore Corp., USA) and elute the DNA 
in a final volume of 50 µl. 
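As a quick sanity check on the cycling program in step 2, the sketch below adds up the programmed hold times. It assumes that "go to step (b), 24 times" means the three-step cycle runs 25 times in total (one initial pass plus 24 repeats), which is how thermocycler GOTO steps commonly behave; ramp times between temperatures are ignored, and the names `Step` and `programmed_seconds` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


INITIAL_DENATURATION = Step(96, 5 * 60)
CYCLE = (Step(95, 15), Step(49, 30), Step(72, 60))  # steps (b) to (d)
FINAL_EXTENSION = Step(72, 5 * 60)
GOTO_REPEATS = 24  # step (e)


def programmed_seconds(initial, cycle, goto_repeats, final_extension):
    """Total hold time, assuming the cycle runs goto_repeats + 1 times."""
    passes = goto_repeats + 1
    cycling = passes * sum(step.seconds for step in cycle)
    return initial.seconds + cycling + final_extension.seconds


total = programmed_seconds(INITIAL_DENATURATION, CYCLE, GOTO_REPEATS, FINAL_EXTENSION)
print(f"{total / 60:.2f} min of programmed hold time (ramping not included)")
```

The actual run is longer because of ramping, so the figure is a lower bound for planning bench time.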

1. Digest the purified PCR product (10 µl) with 20 U of 
either AluI or HhaI in a total volume of 15 µl at 37 °C for 
3 h. 

2. Mix the restriction digest (2 µl) with 2 µl of deionized 
formamide, 0.5 µl of ROX-labeled GS500 internal size standard 
(Applied Biosystems), and 0.5 µl of loading buffer. 

3. Denature each sample by heating at 95 °C for 5 min and 
immediately transferring to ice. 

4. Electrophorese aliquots (1.5 µl) of each 
digested product in a 36-cm 5 % polyacrylamide gel containing 
7 M urea for 6 h at 3,000 V. 

5. Visualize the RFLPs by staining with SYBR Green I dye and 
then photograph. 

6. Determine the lengths of fluorescently labeled T-RFs by comparison 
with internal standards using GeneScan software 
(version 2.1) (Applied Biosystems). 
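Step 6 sizes each fragment against the internal standard. The sketch below shows the idea only: it interpolates linearly between the two size-standard peaks that flank a sample peak. It is not the sizing algorithm used by GeneScan, and the migration positions, standard sizes, and the name `estimate_size_bp` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical (migration position, size in bp) pairs for the internal
# size standard, sorted by migration position.
STANDARD = [(1000, 75), (1400, 100), (2100, 150), (2750, 200), (3900, 300)]


def estimate_size_bp(migration, standard=STANDARD):
    """Linear interpolation between the two flanking standard peaks."""
    positions = [pos for pos, _ in standard]
    if not positions[0] <= migration <= positions[-1]:
        raise ValueError("peak lies outside the range of the size standard")
    i = bisect_left(positions, migration)
    if positions[i] == migration:
        return float(standard[i][1])
    (x0, y0), (x1, y1) = standard[i - 1], standard[i]
    return y0 + (y1 - y0) * (migration - x0) / (x1 - x0)


print(estimate_size_bp(1750))
```

Peaks outside the standard range are rejected rather than extrapolated, since sizes estimated beyond the outermost standards are unreliable.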

1. Estimate the similarity of communities by visual comparison 
of the electropherograms or by numerically analyzing the 
pattern of T-RFLPs in gel images with GelCompar software 
(Applied Maths, Belgium). 

12.3.4. 16S rRNA Library Construction and 16S rDNA Sequencing 

12.3.4.1. DNA Extraction and Purification 



2. Generate a stacked band pattern of T-RFLPs for each community 
from the same terminus following the normalization 
(no background subtraction) and image analysis steps of the 
software. 

3. Produce the similarity matrix for the fragment patterns in the 
samples by calculating the Jaccard coefficient. (The Jaccard coefficient 
considers the presence or absence of bands: the 
number of T-RFs shared between communities relative to the 
total number of T-RFs observed.) 

4. Obtain a similarity dendrogram by the unweighted pair group 
method using average linkages (UPGMA). (A value of 0 means that the 
community fingerprints are completely different from one 
another and a value of 1 indicates that they are identical.) 
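The Jaccard coefficient in step 3 is the number of T-RFs shared by two communities divided by the number of distinct T-RFs found in either. A minimal sketch of the similarity matrix follows; the community names and fragment sizes are hypothetical, and each profile is reduced to a set of T-RF sizes (presence/absence only).

```python
from itertools import combinations


def jaccard(a, b):
    """Shared T-RFs divided by all distinct T-RFs in the two profiles."""
    union = a | b
    if not union:
        raise ValueError("both profiles are empty")
    return len(a & b) / len(union)


# Hypothetical T-RF sizes (bp) present in three community profiles.
profiles = {
    "soil_A": {72, 118, 184, 203, 310},
    "soil_B": {72, 118, 203, 255},
    "soil_C": {64, 255, 402},
}


def similarity_matrix(profiles):
    names = sorted(profiles)
    sim = {(n, n): 1.0 for n in names}
    for x, y in combinations(names, 2):
        sim[(x, y)] = sim[(y, x)] = jaccard(profiles[x], profiles[y])
    return sim


sim = similarity_matrix(profiles)
for (x, y), value in sorted(sim.items()):
    print(f"{x} vs {y}: {value:.3f}")
```

For the dendrogram in step 4, the corresponding distances (1 minus similarity) could be passed to an average-linkage clustering routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"`.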



1. Add 5 g of soil sample to a 50-ml tube containing 5 ml of 0.12 M 
sodium phosphate buffer (PB buffer, pH 8.0). 

2. Vortex for 1 min, incubate at room temperature for 10 min, 
and centrifuge at 7,700 × g for 10 min. 

3. Decant the supernatant and add 5 ml of PB buffer to the soil 
pellet. 

4. Repeat the vortexing, incubation, and centrifugation 
once. 

5. Resuspend the soil pellet in 8 ml of lysis solution I (0.15 M 
NaCl, 0.1 M EDTA, pH 8.0, 10 mg lysozyme), mix, and 
incubate at 37 °C for 1 h with occasional gentle mixing. 

6. Add 8 ml of lysis solution II (0.1 M NaCl, 0.5 M Tris-HCl, pH 
8.0, 12 % SDS), pass the soil suspension through two 
cycles of freezing at −40 °C for 20 min and thawing at 
65 °C for 20 min, and then centrifuge at 7,700 × g for 
10 min. 

7. Mix the supernatant with 2.7 ml of 5 M NaCl and 2.1 ml of 10 % 
Triton X-100 in 0.7 M NaCl and incubate for 10 min at 
65 °C. 

8. Add an equal volume of chloroform:isoamyl alcohol (24:1) 
and centrifuge for 5 min at 3,000 × g. 

9. Transfer the supernatant to a clean tube. 

10. Add an equal volume of 13 % PEG (polyethylene glycol dissolved 
in 1.6 M NaCl) to the supernatant, incubate on ice for 
30 min, and centrifuge at 12,000 × g for 10 min. 

11. Decant the supernatant, wash the pellet with 5 ml of cold 70 % 
ethanol, and air-dry. 

12. Resuspend the pellet of crude DNA extract in 500 µl of deionized 
H2O. 

13. Quantify the DNA yield with a DNA fluorometer after 0.8 % 
(w/v) agarose gel electrophoresis. 

14. Store the DNA at −20 °C until required for PCR 
amplification. 

12.3.4.2. PCR Amplification 

12.3.4.3. Construction of 16S rRNA Library and 16S rDNA Sequencing 

1. Use the crude DNA as template for the amplification 
of archaeal 16S rRNA genes by PCR. 

2. The reaction mixture (25 µl) contains the following components: 

(a) 100 ng of genomic DNA 

(b) Primers at 0.5 µM each. The archaeal primers are 25F 
(5' CYG GTT GAT CCT GCC RG 3') and 1492R 
(5' GGT TAC CTT GTT ACG ACT T 3') 

(c) dATP, dTTP, dCTP, and dGTP, each at a concentration 
of 200 µM 

(d) MgCl2 (mM) 

(e) 1 U of DNA polymerase 

(f) PCR buffer 

3. Use the following PCR program: 

(a) 94 °C for 12 min 

(b) 94 °C for 1 min 

(c) 50 °C for 1 min 

(d) 72 °C for 2 min 

(e) Go to step (b), 24 times 

(f) 72 °C for 7.5 min 

(g) Hold at 15 °C 

(h) End 
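The final concentrations in step 2 translate into pipetting volumes through the dilution relation C1 × V1 = C2 × V2. The sketch below works this out for a 25-µl reaction and scales it to a master mix. The stock concentrations (10 µM primers, 10 mM of each dNTP, 10× buffer used at 1×) and the 10 % overage are assumptions for illustration and are not given in the protocol; MgCl2, polymerase, and template are left out because their stock values are not stated.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_ul):
    """Solve C1*V1 = C2*V2 for V1; both concentrations must share a unit."""
    if stock_conc <= 0 or final_conc < 0:
        raise ValueError("concentrations must be positive")
    if final_conc > stock_conc:
        raise ValueError("stock is more dilute than the target concentration")
    return final_conc * reaction_ul / stock_conc


REACTION_UL = 25.0         # from the protocol
PRIMER_STOCK_UM = 10.0     # assumed stock concentration
DNTP_STOCK_UM = 10_000.0   # assumed 10 mM stock of each dNTP
BUFFER_STOCK_X = 10.0      # assumed 10x buffer, used at 1x

per_reaction = {
    "primer 25F": stock_volume_ul(0.5, PRIMER_STOCK_UM, REACTION_UL),
    "primer 1492R": stock_volume_ul(0.5, PRIMER_STOCK_UM, REACTION_UL),
    "dNTP mix": stock_volume_ul(200.0, DNTP_STOCK_UM, REACTION_UL),
    "PCR buffer": stock_volume_ul(1.0, BUFFER_STOCK_X, REACTION_UL),
}


def master_mix(per_reaction, n_reactions, overage=0.10):
    """Scale per-reaction volumes, adding a fractional overage for pipetting loss."""
    factor = n_reactions * (1.0 + overage)
    return {name: vol * factor for name, vol in per_reaction.items()}


for name, vol in master_mix(per_reaction, 10).items():
    print(f"{name}: {vol:.2f} µl")
```

Water is then added to bring each reaction to 25 µl once the remaining components have been accounted for.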

1. Clone the PCR products of 16S rRNA genes directly into 
the pCR2.1-TOPO, pCR II, or pCR4-TOPO plasmid 
vectors. 

2. Transform the ligated products into competent E. coli. 

3. Detect the positive clones by the appearance of white colonies 
on LB plates containing 40 mg/ml of X-Gal and 50 µg/ml 
ampicillin. 

4. Extract the DNA from the recombinant (white) colonies and 
use the aqueous DNA solution as the DNA template in the PCR 
reaction. 

5. Analyze the products by gel electrophoresis and digest with 
restriction enzyme EcoRI to check for the presence of different 
cloned bands. 

6. Select the high-frequency sequences in clone libraries for 
sequence analysis and phylogenetic tree construction. 

7. Isolate the recombinant plasmids from overnight cultures by 
alkaline lysis and perform a restriction analysis with EcoRI to detect the 
insert. 

8. Analyze the products by gel electrophoresis and digest with 
restriction enzyme EcoRI to check for the presence of different 
cloned bands by their digestion patterns. 

9. Select the high-frequency sequences in clone libraries for 
sequence analysis and phylogenetic tree construction. 

12.3.4.4. Molecular Identification and Phylogenetic Analysis 

1. Check all the sequences for chimeras using the CHIMERA-CHECK 
online analysis program of the RDP-II database. 

2. Determine the taxonomic hierarchy of the sequences using NCBI BLAST 
and the RDP Analysis Tools of Ribosomal Database 
Project-II Release 9 (http://rdp.cme.msu.edu/index.jsp). 

3. Perform multiple sequence alignment with CLUSTAL X, 
selecting related sequences from the NCBI Taxonomy 
Homepage (Tax-Browser) and Ribosomal Database Project-II 
databases. 

4. Construct the phylogenetic trees using the neighbor-joining 
method with 500 or 1,000 bootstrap replications. 

5. Sequences that differ by <3 % are considered to belong to the 
same phylotype. 

6. Sequences with similarity below 95 % are 
assigned to the closest family. 
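The two cutoffs in steps 5 and 6 can be written as a small decision rule. This is a minimal sketch: the function name `classify_by_similarity` is hypothetical, and because the protocol states no rule for similarities between 95 % and 97 %, that band is reported as unresolved rather than guessed.

```python
def classify_by_similarity(similarity_percent):
    """Apply the <3 % difference and <95 % similarity rules from steps 5 and 6."""
    if not 0.0 <= similarity_percent <= 100.0:
        raise ValueError("similarity must be between 0 and 100")
    if similarity_percent > 97.0:      # differs by less than 3 %
        return "same phylotype"
    if similarity_percent < 95.0:
        return "assign to closest family"
    return "unresolved (no rule stated for 95-97 %)"


for value in (99.2, 97.0, 96.1, 93.4):
    print(value, "->", classify_by_similarity(value))
```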

ELISA-Based Identification and Detection of Microbes 

Jyoti Verma, Sangeeta Saxena, and Sunil G. Babu 



Abstract 

Enzyme-linked immunosorbent assay (ELISA) is an analytical technique for detecting the presence of an 
antigen or antibody in a given sample. It has wide applications in clinical diagnosis, pathological 
studies, and quality control. Virtually all microbial species have unique antigen(s), and such 
antigen(s) can be exploited as specific targets for detection by ELISA. The variants of ELISA allow 
us to detect either antigen or antibody, to identify different strains of microbes at a time, and to 
characterize the epitope distribution on the microbial surface. Five variants of the assay have been 
developed: (1) direct ELISA, used for the detection of antigen; (2) indirect ELISA, used for 
the detection of antibody; (3) sandwich ELISA, used for the identification of different epitopes at a time; 
(4) competitive ELISA, used for quantifying the antigen/antibody; and (5) multiplex ELISA, used for the 
identification of multiple antigens/antibodies at a time. Here, we discuss the different variants of 
ELISA and give detailed steps for performing these methods, along with their advantages and limitations. 



13.1 Introduction 



Enzyme-linked immunosorbent assay (ELISA) is a highly sensitive 
immunological technique with high specificity for the detection 
of a particular antigen (Ag) or antibody (Ab) in a sample using 
enzyme-linked antibodies [1, 2]. ELISA was developed in the early 1970s 
and became a popular technique in laboratory diagnosis. It provides 
an ideal system for a wide range of studies in 
biology owing to its flexibility, whereby reactants can be used 
in different combinations, either attached passively to a solid-phase 
support or in the liquid phase. An ELISA is a five-step procedure: 
(1) coat the microtiter plate wells with antigen; (2) block all 
unbound sites to prevent false-positive results; (3) add antibody 
to the wells; (4) add anti-antibody conjugated to an enzyme; and 
(5) add a substrate that reacts with the enzyme to produce a colored 
product, thus indicating a positive reaction. ELISA has been used 
as a diagnostic tool in medicine and plant pathology, as well as a 
quality control check in various industries. 

13.1.1. Principle 

ELISA combines the specificity of antibodies with the sensitivity of simple enzyme 
assays, as enzyme-coupled antibodies are used in this method. 
The conjugated/linked enzyme reacts when a chromogenic or 
fluorogenic substrate is added and produces a color/signal that 
can be read with an ELISA reader. Generally, ELISAs 
are performed in 96-well plates or 12-well strip formats, which permit 
high-throughput results. The bottom of each well is coated with the 
antigen or antibody to be tested; an enzyme-conjugated 
specific antibody is then added, followed by a suitable substrate, and the 
color developed is measured and quantitated. For colorimetric 
assays, the popular enzymes conjugated to the antibodies 
are alkaline phosphatase, horseradish peroxidase, urease, and 
β-galactosidase. 

Serological typing is an immuno-based technique and plays an 
important role in identifying the bacteria, based on the difference 
of antigenic determinants expressed on their cell surface [3, 4]. 
These surface antigens include lipopolysaccharides, capsular poly- 
saccharides, membrane proteins, and extracellular organelles, such 
as flagella and fimbriae [5]. At present, ELISA is the most established 
immunological technique, and the indirect and sandwich 
formats are the ones most commonly used 
for the detection of pathogens [6]. 

ELISA techniques have been developed for the detection of 
whole cell antigen targets or products for pathogens such as 
Bacillus cereus [7], Campylobacter spp. [8], Escherichia coli [9], 
and Salmonella spp. [10] from foods. Obst et al. [11] developed 
an ELISA using a monoclonal antibody against the enterobacterial 
common antigen (ECA), a lipopolysaccharide linked within the 
outer cell membrane of Enterobacteriaceae. 

The use of immunological methods for the detection of spe- 
cific microorganisms is a rapid and simple technique, the accuracy 
of which mainly depends on the specificity of the antibody [12]. 
One way to increase specificity is to select monoclonal antibodies 
which are highly specific in their action against a specific epitope. 
However, since an epitope can be present in more than one 
antigenic agent, a rigorous specificity testing of the monoclonal 
antibodies synthesized with closely and distantly related bacterial 
strains must precede the routine testing of environmental samples. 

The basic principle of ELISA is to use an enzyme attached to 
antibody to detect either antigen or antibody in an immunoassay 
to allow quantification through the development of color after the 
addition of a suitable chromogenic substrate. ELISA 
targets the antigen with highly specific antibodies that are conjugated 
to enzymes. These enzyme-conjugated antibodies are 
allowed to bind to their respective antigens, and after the 
addition of a suitable chromogenic substrate to this Ag-Ab-E 
complex, color is produced by the enzymatic activity. 
The enzymatic reaction is then stopped, and the color developed is 
detected spectrophotometrically with an ELISA reader. 

13.1.2. Types of ELISA 

The ELISA method is a benchmark for the quantitation of pathological 
antigens, and there are many variations of the method. 
ELISAs are adaptable to high-throughput screening because 
results are rapid, consistent, and relatively easy to analyze. ELISAs 
can be performed with a number of modifications to the basic 
procedure. Five types of ELISA format have been developed, 
allowing qualitative and quantitative measurement 
of either antigen or antibody. They are 

1. Direct ELISA 

2. Indirect ELISA 

3. Sandwich ELISA 

4. Competitive ELISA 

5. Multiplex ELISA 

13.1.2.1. Direct ELISA 

Direct ELISA is a method for detecting and measuring a particular 
antigen in a sample using a specific antibody conjugated to an 
enzyme. As the name indicates, the antigen is directly recognized 
and identified by an antibody that is itself conjugated to the 
enzyme. Briefly, microbial antigen(s) in purified or complex 
form are coated on the surface of the microtiter wells, 
the wells are incubated with the specific enzyme-conjugated 
antibodies, and color develops after addition of a suitable substrate. 
The development of color shows the 
presence of antigen, and the intensity of the color indicates the 
quantity of antigen. The protocol of direct ELISA is illustrated 
in Fig. 13.1. 

13.1.2.1.1. Advantages 

It is relatively quick because only one antibody and fewer steps are 
used. It also eliminates the cross-reactivity of a secondary antibody. 

13.1.2.1.2. Limitations 

Labeling every primary antibody is time-consuming and 
expensive. The immunoreactivity of the primary antibody 
might be adversely affected by labeling with enzymes or tags. 
Further, certain antibodies are unsuitable for labeling. There is 
no flexibility in the choice of primary antibody label from one 
experiment to another, and signal amplification is minimal compared 
with methods that use secondary antibody labeling. 

13.1.2.2. Indirect ELISA 

The indirect ELISA is a two-step method that uses a labeled secondary 
antibody for detection. First, a primary antibody binds the antigen, 
and then a labeled secondary antibody binds to the primary 
antibody and is used for color development. Briefly, a sample containing 
primary antibody (Ab1) is added to an antigen-coated 
microtiter well and allowed to react with the antigen attached to 
the well. After washing, the enzyme-conjugated secondary antibody 
(Ab2), which is specific to the isotype of the primary antibody, is 
added. After any excess antibody is removed by washing, the addition 
of the substrate yields color, which can be measured with an 
ELISA reader. The protocol of indirect ELISA is illustrated in 
Fig. 13.2. The indirect ELISA is a method of choice for detecting 
the presence of serum antibodies against HIV. 

Fig. 13.1 Direct ELISA. Antigen is passively adsorbed to the solid phase of the 
microtiter plate by incubation; after washing, labeled Ab 
conjugated to an enzyme is added 

13.1.2.2.3. Advantages 

It offers the advantage that any number of antisera can be examined 
for binding to a given antigen using a single anti-species 
conjugate. This property has been heavily exploited in diagnostic 
applications, particularly when screening large numbers 
of samples. Signal amplification is also improved by 
the use of secondary antibodies. 
Fig. 13.2 Indirect ELISA. Antigen is passively adsorbed to the solid phase of the 
microtiter plate by incubation 

13.1.2.2.4. Limitations 

One limitation of this type is that cross-reactivity may occur 
with the secondary antibody, resulting in a nonspecific signal. An extra 
incubation step is also required in the procedure. 

13.1.2.3. Sandwich ELISA 

The sandwich ELISA measures the amount of antigen that is 
sandwiched between two layers of antibodies. Briefly, an antibody 
(the “capture” antibody) is allowed to bind to a solid phase, i.e., 
the bottom of a microtiter well. Then a test sample containing 
antigen is added and allowed to react with the bound antibody. 
After the well is washed, a labeled second antibody (the “detection” 
antibody), which is specific to another epitope, is added and 
allowed to bind to the antigen, thus completing the “sandwich”. 
After the removal of any free antibody, substrate is added and the 
color can be measured. The basic protocol of sandwich ELISA is 
illustrated in Fig. 13.3. 

13.1.2.3.5. Advantages 

It offers fast and accurate determination of antigen concentration in an 
unknown sample, and the antigen does not need to be 
purified prior to use. For example, strain identification of 
microorganisms or pathogens during epidemics is generally done with this assay. 
It is useful for quantitation of antigens when the 
antigen concentration is low and/or the sample has high concentrations of 
contaminating protein(s). This method gives reproducible results with high sensitivity. 

13.1.2.3.6. Limitations 

One major disadvantage is that not all antibodies can be used. The 
antigen to be measured must contain at least two antigenic sites 
capable of binding antibody, because two antibodies act in the 
sandwich. For this reason, sandwich assays are restricted to the 
quantitation of multivalent antigens such as proteins or polysaccharides. 

13.1.2.4. Competitive ELISA 

This type of ELISA quantifies antigen by a competitive 
method. Briefly, the antigen to be measured is incubated with 
a controlled amount of antibody to form antigen-antibody complex. 
This mixture is then added to an antigen-coated 
microtiter well. The more antigen present in the sample, 
the less free antibody is available to bind to the antigen-coated 
well. Addition of an enzyme-conjugated secondary 
antibody specific to the primary antibody reveals the amount of 
primary antibody bound to the well, which in turn reflects the 
amount of antigen present in the sample. The protocol of competitive 
ELISA is illustrated in Fig. 13.4. In this type of ELISA, 
the higher the antigen concentration in the sample, the weaker the 
final signal, as less free primary antibody is available to bind to the antigen 
on the plate. 
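The inverse relation between antigen concentration and signal is often expressed as the percentage of the maximum (zero-antigen) signal that remains, commonly written B/B0. The sketch below computes it from absorbance readings; the readings, the blank value, and the name `percent_bound` are hypothetical and serve only to show that the value falls as the antigen in the sample rises.

```python
def percent_bound(sample_abs, zero_antigen_abs, blank_abs=0.0):
    """B/B0 in percent: sample signal relative to the zero-antigen control."""
    b0 = zero_antigen_abs - blank_abs
    if b0 <= 0:
        raise ValueError("zero-antigen control must exceed the blank")
    return 100.0 * (sample_abs - blank_abs) / b0


# Hypothetical absorbance readings.
B0_READING = 1.20   # well with no competing antigen (maximum signal)
BLANK = 0.05        # substrate-only background
readings = {"low antigen": 1.00, "medium antigen": 0.60, "high antigen": 0.20}

results = {name: percent_bound(a, B0_READING, BLANK) for name, a in readings.items()}
for name, value in results.items():
    print(f"{name}: {value:.1f} % of maximum signal")
```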
Fig. 13.3 Sandwich ELISA. Ab is passively adsorbed to the solid phase of the 
microtiter plate by incubation 

Fig. 13.4 Competitive ELISA 

13.1.2.4.7. Advantages 

The main advantage is that non-purified primary antibodies can be used. 

13.1.2.4.8. Limitations 

The higher the original antigen concentration, the weaker the eventual 
signal. In some competitive ELISA formats, the more antigen in the sample, the less labeled 
antigen is retained in the well and the weaker the signal. 

13.1.2.5. Multiplex ELISA 

The sandwich ELISA allows detection of two different epitopes 
present on the antigen, suggesting the possibility of detecting 
multiple epitopes present on the antigen or sample(s) in the 
microtiter plate. The logical progression of the ELISA toward a 
protein array format, which allows simultaneous detection of multiple 
antigens at multiple array addresses within a single microtiter 
well, led to the development of multiplex ELISA. Multiplexing 
in assays simply refers to the ability to output multiple readings 
from a single sample. Generally, multiplexing can be achieved 
through an antibody array, in which different primary antibodies are 
printed on a glass plate to capture their respective antigens in given 
samples such as cell lysate, tissue extract, etc., allowing a single 
microtiter well to capture different antigens at a time. The protocol 
of multiplex ELISA is illustrated in Fig. 13.5. The different antigens 
captured are detected by direct, indirect, sandwich, or competitive 
ELISA, depending on the antibody array technique used. 

Fig. 13.5 Multiplex ELISA. Sample containing different antigens is added to the microtiter plate and 
washed. Each spot is a different assay (e.g., IL-2). On addition of substrate, a spot emits light if its antigen is 
present and is not visible if no antigen is present. The intensity of each spot is measured using an 
optical device 

13.1.2.5.9. Advantages 

Compared with the traditional ELISA, multiplex arrays have a 
number of advantages, including (a) high-throughput multiplex 
analysis (up to 25 assays/well), (b) a smaller sample 
volume requirement, (c) efficiency in terms of time and cost, (d) the ability to 
evaluate the levels of a given molecule in the context of multiple 
others, (e) the ability to perform repeated measurements of the same 
formats in the same subjects under the same experimental assay 
conditions, and (f) reliable detection of different 
proteins across a broad dynamic range of concentrations. 

13.1.2.5.10. Limitations 

In spite of these advantages, caution is necessary when considering 
the application of multiplex arrays. They require experience and expertise 
to perform. Multiplex assays, by their very nature, involve 
potential interactions between multiple different antibodies and 
antigens in the sample/assay solution. One cannot assume that a 
reliable uniplex assay can simply be added to a functioning 
multiplex assay. Non-reactivity to all other antibodies must first be 
established, and the lowest amount possible must be used to 
minimize such cross-reactions. 

13.1.2.6. Applications 

Advances in the production of monoclonal/polyclonal antibodies, 
in enzyme/fluorescent/chemiluminescent conjugation to 
these antibodies, and in solid-phase immobilization 
of Ags have made ELISA a versatile technique. Further, the variations 
developed from the basic ELISA format have made it sensitive, handy, 
and robust; it can be done quickly, easily, and economically 
with reproducible results. The best results have been 
obtained with the sandwich format, utilizing highly purified, 
pre-matched capture and detector antibodies. 

1. ELISA is preferable to other immunological 
techniques because of its sensitivity (~0.0001-0.01 µg Ab/ml; 
can be increased to ~0.00001-0.01 µg Ab/ml). It is used in 
the detection and quantification of substances such as peptides, 
proteins, antibodies, hormones, haptens, drugs and their 
metabolites, and potential food allergens (milk, peanuts, 
walnuts, almonds, and eggs), and is increasingly applied 
to check GM food. 

2. Indirect ELISA is the most common and widely used technique 
in clinical diagnostics, where it is used to determine 
serum antibody concentrations, e.g., to detect 
antibodies in a blood sample as evidence of past exposure to diseases 
such as Lyme disease, trichinosis, HIV, and bird 
flu. 

3. Sandwich ELISA allows identification of the different strains of 
a pathogen when the pathogen/antigen is limited. 

4. Competitive assays are often used when the antigen to be 
measured is small and has only one epitope, or antibody 
binding site. 

13.2 Materials 

96-well microtiter plates; ELISA reader; microplate shaker; 
multichannel pipettor; micropipettes and Eppendorf tubes; multiplex 
ELISA-based systems. 

13.2.1. Buffers 

13.2.1.1. Coating Buffer 

13.2.2. Reagents 



1. Coating buffer (0.1 M carbonate-bicarbonate buffer, pH 9.6); 1 l 

(a) 3.03 g Na2CO3 

(b) 6.0 g NaHCO3 

(c) 1,000 ml distilled water 

(d) Store at 4 °C 

2. Phosphate-buffered saline (PBS), pH 7.4; 1 l 

(a) 2.32 g Na2HPO4 

(b) 0.2 g KCl 

(c) 0.2 g K3PO4 

(d) 8.0 g NaCl (500 ml distilled water) 

(e) Store at 4 °C 

3. Blocking buffer solution; 1 l 

(a) 1,000 ml PBS buffer 

(b) 0.5 ml Tween 20 

(c) 10 g BSA 

(d) Store at 4 °C 

(e) (1 % BSA, serum, nonfat dry milk, casein, or gelatin in PBS) 

4. Washing solution; 1 l 

(a) [PBS or Tris-buffered saline (pH 7.4) with a detergent 
such as 0.05 % (v/v) Tween 20] 

(b) 0.5 ml Tween 20 

(c) 999.5 ml PBS buffer 

(d) Store at 4 °C 

5. Antibody dilution buffer 

Primary and secondary antibodies should be diluted in 
1× blocking solution. 
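The recipes above are given per litre. When a smaller batch is needed, each component scales in proportion to the final volume, as in the following sketch. The dictionary keys and the function name `scale_recipe` are illustrative; the amounts are those listed for the coating buffer and the washing solution.

```python
# Per-litre amounts taken from the recipes above.
COATING_BUFFER_PER_L = {
    "Na2CO3 (g)": 3.03,
    "NaHCO3 (g)": 6.0,
    "distilled water (ml)": 1000.0,
}
WASHING_SOLUTION_PER_L = {
    "Tween 20 (ml)": 0.5,
    "PBS buffer (ml)": 999.5,
}


def scale_recipe(per_litre, final_volume_ml):
    """Scale every component of a per-litre recipe to the requested volume."""
    if final_volume_ml <= 0:
        raise ValueError("final volume must be positive")
    factor = final_volume_ml / 1000.0
    return {name: amount * factor for name, amount in per_litre.items()}


coating_250 = scale_recipe(COATING_BUFFER_PER_L, 250)
for name, amount in coating_250.items():
    print(f"{name}: {amount:.4g}")

# Check that the washing solution matches the stated 0.05 % (v/v) Tween 20.
total_ml = sum(WASHING_SOLUTION_PER_L.values())
tween_percent = 100.0 * WASHING_SOLUTION_PER_L["Tween 20 (ml)"] / total_ml
print(f"Tween 20: {tween_percent:.2f} % (v/v)")
```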

Purified antigens/antibodies; enzyme-conjugated (alkaline phosphatase, 
horseradish peroxidase) antibodies (primary/secondary); 
PNPP (p-nitrophenyl phosphate, disodium salt); 
ABTS (2,2'-azinobis [3-ethylbenzothiazoline-6-sulfonic acid]-diammonium 
salt); OPD (o-phenylenediamine dihydrochloride); 
TMB (3,3',5,5'-tetramethylbenzidine); hydrogen peroxide; 
Tween 20; and the above-mentioned chemicals for the preparation 
of buffers. 

13.3 Methods 

13.3.1. Direct ELISA 

13.3.1.1. Coating 

1. Dilute the antigen with coating buffer and coat the wells of the 
ELISA plate with 100 µl/well (see Note 1). 

2. Cover the plate with plastic and incubate for 2 h at room temperature 
or overnight at 4 °C. 

3. Remove the antigen coating solution from the wells by 
flicking the plate over a sink, and wash the plate thrice 
by filling the wells with 200 µl of washing buffer/PBS (see 
Note 2). 

13.3.1.2. Blocking 

1. Block the nonspecific protein-binding sites in the coated wells 
by adding 200 µl of blocking buffer or 5 % serum in PBS per well 
(see Note 3). 

2. Cover the plate with plastic and incubate for 2 h at room temperature 
or overnight at 4 °C. 

3. Remove the blocking buffer from the wells by flicking the 
plate over a sink and wash the plate thrice with PBS. 

13.3.1.3. Incubation 

1. Add 100 µl of the enzyme-conjugated antibody in blocking 
buffer to each well (see Note 4). 

2. Cover the plate with plastic and incubate for 2 h at room temperature. 

3. Remove the enzyme-conjugated antibody solution from the 
wells by flicking the plate over a sink and wash the plate four times 
with PBS/washing buffer. 

13.3.1.4. Detection 

1. Add 100 µl or 50 µl of substrate (for ALP or HRP, respectively) 
per well with a multichannel pipette (see Note 5). 

2. After sufficient color development, add 50 µl of stopping 
buffer (8 N H2SO4) (see Note 6). 

3. Read the absorbance of each well at 492 nm with a microplate 
reader (see Note 5). 

13.3.2. Indirect ELISA 

13.3.2.1. Coating 

1. Dilute the antigen with coating buffer and coat the wells of the 
ELISA plate with 100 µl/well (see Note 1). 



2. Cover the plate with plastic and incubate for 2 h at room temperature 
or overnight at 4 °C. 

3. Remove the antigen coating solution from the wells 
by flicking the plate over a sink and wash the plate thrice by 
filling the wells with 200 µl of washing buffer/PBS. 

13.3.2.2. Blocking 

1. Block the nonspecific protein-binding sites in the coated wells 
by adding 200 µl of blocking buffer or 5 % serum in PBS per well. 

2. Cover the plate with plastic and incubate for 2 h at room temperature 
or overnight at 4 °C. 

3. Remove the blocking buffer from the wells by flicking the 
plate over a sink and wash the plate thrice with PBS. 

13.3.2.3. Incubation 

1. Add 100 µl of primary antibody or antiserum diluted in 
blocking buffer to each well (see Note 4). 

2. Cover the plate with plastic and incubate for 2 h at room temperature 
or overnight at 4 °C. 

3. Remove the primary antibody solution from the wells by 
flicking the plate over a sink and wash the plate at least thrice with 
PBS/washing buffer. 

4. Dilute the enzyme-conjugated secondary antibody with 
blocking buffer and add 100 µl to each well of the plate. 

5. Cover the plate with plastic and incubate for 2 h at room 
temperature. 

6. Remove the labeled secondary antibody solution from the 
wells by flicking the plate over a sink and wash the plate at least five 
times with PBS/washing buffer. 

13.3.2.4. Detection 

1. Add 100 µl or 50 µl of substrate (for ALP or HRP, respectively) 
per well with a multichannel pipette (see Note 5). 

2. After sufficient color development, add 50 µl of stopping 
buffer (8 N H2SO4) (see Note 6). 

3. Read the absorbance of each well at 492 nm with a microplate 
reader (see Note 5). 

13.3.3. Sandwich ELISA 

13.3.3.1. Coating 

1. Dilute the antibody with coating buffer and coat the wells of the 
ELISA plate with 100 µl/well (see Note 1a). 

2. Cover the plate with plastic and incubate for 2 h at room temperature 
or overnight at 4 °C. 

3. Remove the antibody solution from the wells by 
flicking the plate over a sink and wash the plate thrice by filling 
the wells with 200 µl of washing buffer/PBS. 

13.3.3.2. Blocking 

1. Block the nonspecific protein-binding sites in the coated wells 
by adding 200 µl of blocking buffer or 5 % serum in PBS per well. 

2. Cover the plate with plastic and incubate at 4 °C overnight or 
for 2 h at room temperature. 

3. Remove the blocking buffer from the wells by flicking the 
plate over a sink and wash the plate thrice with PBS. 

13.3.3.3. Standard and Samples Incubation 

1. Dilute the standards/samples with blocking buffer and transfer 
100 µl to each well. 

2. Cover the plate with plastic and incubate at 4 °C overnight or 
for 2 h at room temperature. 

3. Remove the standard/sample solution from the wells by 
flicking the plate over a sink and wash the plate thrice with PBS. 

13.3.3.4. Incubation with HRP-Conjugated Antibody 

1. Dilute the labeled antibody (0.25-2 µg/ml) in blocking 
buffer and add 100 µl to each well. 

2. Cover the plate with plastic and incubate for 1 h at room 
temperature. 

3. Remove the labeled antibody solution from the wells by flicking the 
plate over a sink and wash the plate thrice with PBS. 

13.3.3.5. Detection 

1. Add 100 µl or 50 µl of substrate solution (for ALP or HRP, 
respectively) per well. 

2. After sufficient color development, add 50 µl of stopping 
buffer (8 N H2SO4). 

3. Read the absorbance of each well at 492 nm with a microplate 
reader. 
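Because standards are run alongside the samples in the sandwich format, the absorbance readings can be converted to concentrations with a standard curve. The sketch below fits a straight line to the standards by least squares and inverts it for an unknown. It assumes that the readings are blank-corrected and fall in the linear part of the curve; real ELISA curves are often sigmoidal over the full range, so this is a simplification. The standard concentrations, absorbances, and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standard concentrations must differ")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical standards: concentration (ng/ml) and blank-corrected A492.
std_conc = [0.0, 1.0, 2.0, 4.0, 8.0]
std_abs = [0.00, 0.10, 0.20, 0.40, 0.80]
slope, intercept = fit_line(std_conc, std_abs)


def concentration_from_absorbance(absorbance):
    """Invert the fitted line; only valid within the range of the standards."""
    if not min(std_abs) <= absorbance <= max(std_abs):
        raise ValueError("reading lies outside the standard curve")
    return (absorbance - intercept) / slope


print(f"{concentration_from_absorbance(0.35):.2f} ng/ml")
```

Samples that read above the highest standard would normally be diluted and re-run rather than extrapolated.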


13.3.4. Competitive 
ELiSA 

13.3.4.1. Coating 


1. Dilute the antibody with coating buffer and coat the wells of an
ELISA plate with 100 µl/well (see Note 1a).

2. Cover the plate with plastic and incubate for 2 h at room
temperature or overnight at 4 °C.

3. Remove the antibody solution from the wells by flicking the
plate over a sink and wash the plate thrice by filling the wells
with 200 µl of washing buffer/PBS.
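
The coating step fixes only the volume per well (100 µl) and, through the Notes, a working range for the antibody concentration. The following minimal sketch shows how the volume of stock antibody for a whole plate could be worked out with C1 x V1 = C2 x V2. The function name, the 1 mg/ml stock, the 2 µg/ml working concentration, and the 10 % pipetting excess are illustrative assumptions and are not taken from the protocol.

```python
def coating_mix(stock_ug_per_ml, working_ug_per_ml, wells=96,
                ul_per_well=100.0, excess=0.10):
    """Return (stock_ul, buffer_ul) needed to coat `wells` wells.

    `excess` is an assumed pipetting allowance, not a protocol value.
    """
    if not 0 < working_ug_per_ml <= stock_ug_per_ml:
        raise ValueError("working concentration must be > 0 and <= stock")
    total_ul = wells * ul_per_well * (1.0 + excess)
    # C1 * V1 = C2 * V2, solved for the stock volume V1
    stock_ul = total_ul * working_ug_per_ml / stock_ug_per_ml
    return stock_ul, total_ul - stock_ul


# Hypothetical example: 1 mg/ml stock diluted to 2 µg/ml for a full plate
stock_ul, buffer_ul = coating_mix(stock_ug_per_ml=1000.0, working_ug_per_ml=2.0)
print(f"{stock_ul:.2f} µl stock + {buffer_ul:.2f} µl coating buffer")
```

The same helper can be reused for the labeled antibody by passing its own stock and working concentrations.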


13.3.4.2. Blocking 


1. Block the nonspecific protein binding sites in the coated wells
by adding 200 µl of blocking buffer or 5 % serum in PBS per well.

2. Cover the plate with plastic and incubate at 4 °C overnight or
for 2 h at room temperature.

3. Remove the blocking buffer from the wells by flicking the
plate over a sink and wash the plate thrice with PBS.




13.3.4.3. Competitive Incubation

1. Dilute the standards/samples with blocking buffer and dilute
the enzyme-conjugated antigen in the blocking buffer.

2. Mix the standard/sample and HRP-conjugated antigen
together and add 100 µl of the diluted mixture to the wells.

3. Cover the plate with plastic and incubate at 4 °C overnight
or for 2 h at room temperature.

4. Remove the mixture from the wells by flicking the plate over
a sink and wash the plate thrice with PBS.


13.3.4.4. Detection

1. Add 100 µl or 50 µl of substrate solution per well (for ALP or
HRP, respectively).

2. After sufficient color development, add 50 µl of stopping
buffer (8 N H2SO4).

3. Read the absorbance of each well at 492 nm with a microplate
reader.


13.3.5. Multiplex ELISA

13.3.5.1. Coating

1. Coat the wells of a microtiter plate with 100 µl of antigen at a
concentration of 0.5-10 µg/ml in PBS or coating buffer and
incubate for 2 h at room temperature.

2. Remove the coating solution and wash the plate (see Notes
1 and 1a).


13.3.5.2. Blocking

1. Block the wells by adding blocking solution (BSA/PBS) to
each well.

2. Cover the plate with plastic, incubate for 2 h at room
temperature or overnight at 4 °C, and wash the plate (see
Note 3).


13.3.5.3. Incubation with the Antibody

1. Add the enzyme-conjugated antibody, cover the plate with
plastic, and incubate for 2 h at room temperature or
overnight at 4 °C.

2. Wash the plate (see Note 2).


13.3.5.4. Detection or Color Development

1. Add the substrate solution to each well with a multichannel
pipette.

2. Read the absorbance of each well with a plate reader after
color development (see Notes 5 and 6).


13.3.5.5. Analysis of Data

1. Prepare a standard curve from the data produced from the
serial dilutions, with concentration on the X-axis and
absorbance on the Y-axis.

2. Interpolate the concentration of the sample from this
standard curve (see Note 7).
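
The interpolation step can be sketched in a few lines. This is a minimal illustration that assumes simple linear interpolation between the two standards bracketing the sample absorbance; many laboratories fit a four-parameter logistic curve instead. The function name and all numerical values are hypothetical and are not taken from the protocol.

```python
from bisect import bisect_left


def interpolate_concentration(standards, absorbance):
    """Estimate a concentration by linear interpolation between standards.

    standards: list of (concentration, absorbance) pairs from the
    serial dilutions; absorbance must change monotonically.
    """
    pts = sorted(standards, key=lambda p: p[1])   # order by absorbance
    abs_vals = [a for _, a in pts]
    if not abs_vals[0] <= absorbance <= abs_vals[-1]:
        raise ValueError("absorbance outside the range of the standard curve")
    i = bisect_left(abs_vals, absorbance)
    if abs_vals[i] == absorbance:                 # exactly on a standard
        return pts[i][0]
    (c0, a0), (c1, a1) = pts[i - 1], pts[i]       # bracketing standards
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


# Hypothetical standards: (concentration in ng/ml, absorbance at 492 nm)
standards = [(0, 0.05), (1, 0.20), (2, 0.38), (4, 0.71), (8, 1.30)]
print(f"{interpolate_concentration(standards, 0.545):.2f} ng/ml")
```

Samples reading above the highest standard raise an error rather than being extrapolated; in practice such samples would be diluted and measured again.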
13.4 Notes

13.4.1. Coating

1. Antigen or antibody should be diluted in coating buffer to
immobilize it on the wells. The dilution depends on the
requirement; the concentration of coated antigen ranges
from 1 to 10 µg/ml. Test samples containing pure antigen are
usually pipetted onto the plate at <2 µg/ml. The antigen protein
concentration should not exceed 20 µg/ml, as this will
saturate most of the available sites on the microtiter plate.

(a) The concentration of coated antibody ranges from 0.5 to
10 µg/ml.

(b) Avoid contamination; take care to prevent spillover
from adjacent wells.


13.4.2. Washing

1. Remove the coating solution and wash the plate thrice by
filling the wells with 200 µl of PBS. The solutions or washes
are removed by flicking the plate over a sink. The remaining
drops are removed by patting the plate on blotting paper.

(a) Make sure that all incubations are carried out in 100 %
humid conditions. Do not allow the plate to dry out
between intermediate washing and incubation periods.
Always refrigerate plates in sealed bags with a desiccant
to maintain stability and moisture content.


13.4.3. Blocking

1. Block the remaining protein binding sites in the coated wells
by adding 200 µl of blocking buffer (5 % nonfat dry milk/PBS)
per well. Alternative blocking reagents include BSA.

(a) Although 2 h is usually enough to obtain a strong signal,
if a weak signal is obtained, stronger staining will often be
observed after incubation overnight at 4 °C.


13.4.4. Incubation

1. The concentration of incubated antibody is based on the
manufacturer's instructions, and the dilution is prepared
immediately before use.


13.4.5. Detection


1. The four most popular enzymes used in ELISA are alkaline
phosphatase (ALP), horseradish peroxidase (HRP), urease, and
β-galactosidase, each used with its own substrate. For example,
for alkaline phosphatase the substrate is p-nitrophenyl
phosphate (pNPP), and for peroxidases, 2,2'-azino-bis
(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS),
o-phenylenediamine (OPD), and 3,3',5,5'-tetramethylbenzidine
base (TMB) are used as substrates. For most applications pNPP
is the most widely used substrate for alkaline phosphatase, and
the yellow color of nitrophenol can be measured at 405 nm
after 15-30 min incubation at room temperature. This reaction
can be stopped by adding an equal volume of 0.75 M NaOH. The
most commonly used substrate for HRP is hydrogen peroxide.
Cleavage of hydrogen peroxide is coupled to oxidation of a
hydrogen donor, which changes color during the reaction.
Common hydrogen donors are OPD, a light-sensitive substrate
that is stored in the dark and measured at 492 nm, and ABTS
(2,2'-azino-di-[3-ethylbenzothiazoline-6-sulfonic acid]
diammonium salt), whose end product is green and whose optical
density can be measured at 416 nm.

(a) ALP and HRP enzyme substrates are toxic if inhaled or
swallowed. Avoid contact with skin.

(b) Sodium azide is an inhibitor of horseradish peroxidase.
Do not include sodium azide in buffers or wash solutions
if an HRP-labeled conjugate will be used for detection.

(c) Avoid contact with the acidic Stop Solution and with the
Substrate Solution, which contains hydrogen peroxide.

2. Generally 10 min is enough for color development with
HRP conjugates and OPD-H2O2; the reaction is then stopped
with 8 N H2SO4.

(a) The antibody-enzyme conjugate cleaves the reagent and
a color develops. Even a small amount of bound
enzyme, if given enough time, produces more color;
hence the reaction needs to be terminated with
acid. Otherwise all samples would yield the same
absorbance and would be indistinguishable.
Once the reaction has been stopped at an optimal contrast,
spectrophotometric reading yields quantifiable results.

3. Prepare a standard curve from the data produced from the
serial dilutions, with concentrations on the X-axis and
absorbance on the Y-axis. Interpolate the concentrations of the
sample from this standard curve.

(a) Be consistent when adding standards to the assay plate.
Add standards first and then samples to calibrate the
results. Add standards to the plate in order from low
concentration to high concentration, as this will
minimize the risk of compromising the standard curve.

(b) Results of ELISA can be measured both qualitatively and
quantitatively. Qualitative results provide a simple
positive or negative answer in the form of the presence or
absence of a visible colored product for a sample,
whereas in quantitative ELISA the optical density (or
fluorescence) is measured and compared with a standard
curve, which is typically a serial dilution of the protein to
be quantified.

4. Always prepare fresh stocks in limited quantity and discard any
remaining solutions/samples. Always prepare fresh buffers at
the correct pH. Samples should be properly stored, refrigerated
at 2-4 °C, for 1-2 days. For longer periods, store them at
-20 °C. Allow all the components to come to room
temperature before use.

5. The detection systems are based on flow cytometry,
chemiluminescence, or electrochemiluminescence technology,
depending on the type of label attached to the detecting
antibody. The label can be, for example, a fluorophore or an
enzyme. The most commonly used detection systems are flow
cytometric multiplex arrays, also known as bead-based
multiplex assays, in which chromogenic or fluorogenic
emissions are detected by flow cytometric analysis.

6. Antigens, serum samples, and assay chemicals used in ELISA
may be potentially infectious and/or toxic. Wear gloves and a
lab coat when handling serum and plasma specimens that may
carry blood-borne infections.

Analysis of Microbial Diversity and Construction
of Metagenomic Library

Thangamani Rajesh, Jeyaprakash Rajendhran, and
Paramasamy Gunasekaran

Abstract 

Culture-independent metagenomic methods represent a promising approach for the identification of novel
genes and genomes of uncultivable microorganisms. A wide range of protocols have been developed for
the isolation of community DNA from environmental sources. Three important steps in DNA isolation
from soil are (a) cell lysis, (b) removal of contaminants, in particular humic acid, and (c) purification of
isolated community DNA. Depending on the nature of the environmental DNA source, modifications are
usually made in any of the above steps. The steps involved in the isolation of DNA and RNA from soil
samples and the appropriate purification strategies are described in this chapter. Additionally, protocols for
microbial diversity analysis using 16S ribosomal RNA analysis and for construction of a metagenomic
library for screening of functional genes from metagenomic DNA are given.



14.1 Introduction 



Metagenomic studies in recent years have led to the discovery of
novel enzymes, antibiotics, and uncultivable bacterial
phylogenetic groups. A basic prerequisite in metagenomic
approaches is the recovery of environmental DNA in sufficient
quantity and purity [2]. Numerous methods have been developed
for the recovery of environmental DNA in sufficient quantity for
further applications. Depending on the nature of the environment
under investigation, technologies for purification of extracted
environmental DNA have also been described [4, 5]. Many
protocols rely on a physical treatment, bead beating, for cell
lysis. The addition of glass beads of defined diameter and
application of vigorous vibration cause rapid lysis of microbial
cells in solution. Because the glass beads are very small, the
pressure applied is sufficient to disrupt bacterial cells without
causing any damage to the DNA. Following bead beating, the glass
beads, together with soil and larger debris, are removed by
centrifugation. This allows the complete disruption of microbial
cells, resulting in efficient lysis and recovery of total DNA.

Other protocols involve release of the tightly bound cells and
spores by means of ultrasonication. With ultrasonic waves of
defined frequency, microbial cells are disrupted to release DNA.
Since the frequency of the ultrasonic waves can be controlled,
this method ensures complete lysis of bacterial cells, fungal cells,
and spores. This method also facilitates the release of tightly
bound cells from the soil particles. Other techniques allow the
isolation of DNA from cells previously separated from soil. Lysis
of bacterial cells in molten agarose plugs ensures the recovery of
high molecular weight metagenomic DNA. In this process, the
bacterial cells are embedded in low-melting agarose plugs and the
cells are lysed within the agarose gel matrix. Since the lysis is
performed within the plugs, high molecular weight DNA is
obtained, which is suitable for metagenomic library construction
using BAC vectors.

Co-purification of humic substances is a major problem in soil
metagenomic DNA isolation. These contaminants are not
completely removed during the DNA extraction protocols.
Humic compounds interfere with downstream applications such
as PCR and restriction digestion. Therefore, purification of the
isolated DNA is essential for the desired applications. Recently,
several simple and rapid purification methods have been reported
for the successful removal of contaminants from metagenomic
DNA [1, 3]. When Sephadex beads are used for the purification,
the Sephadex G-50 matrix exhibits differential binding properties
toward DNA and humic acid. Depending on the time of passage
through this solid matrix and on the centrifugation speed, DNA
can be separated from contaminating substances. Other techniques
involve the electroelution of DNA molecules separated in an
agarose gel, preferably a low-melting agarose gel, which ensures
high purity. As the DNA molecules are entrapped
in a solid matrix, physical force is required to release the DNA
from the agarose gel. The application of an electric field enables
the release of DNA, which is called electroelution. The DNA
sample is further subjected to dialysis to remove salts and is
concentrated either by precipitation or using vacuum concentrators
[6, 7]. Metagenomic studies help to understand the genetic
resources of uncultured microbes. Metatranscriptomics deals with
the understanding of microbial gene expression patterns in
environmental microbial communities. Metatranscriptomic studies
involve the direct extraction of RNA (metagenomic RNA) and
further downstream applications [8].

Available fingerprinting methods for microbial subtyping are
ultimately based on differences in the genome sequence.
Therefore, DNA sequencing would appear to be the best
approach to differentiate subtypes. Since it is impractical to
sequence the genome of every microbial strain, sequencing of
stable marker genes is considered a potential strategy for
microbial species identification. The rRNA genes are highly
conserved in all forms of life. Certain regions in the rDNA
sequences are highly conserved even between distantly related
organisms. This allows the precise positioning of distantly related
organisms in the evolutionary tree, and it also ensures a true
measure of differences between closely related species. In
bacteria, 16S rRNA sequences are used extensively to determine
taxonomy and phylogeny (evolutionary relationships) and to
estimate rates of species divergence among bacteria. The 16S rRNA
gene is the most widely used molecular chronometer for measuring
evolutionary relationships, as bacterial identification and
differentiation are generally based on the amplification of
16S rRNA gene sequences followed by sequencing and comparison
of the sequence with known 16S sequences in databases.

PCR cloning kits take advantage of the terminal transferase
activity of Taq DNA polymerase. Taq DNA polymerase adds a
single 3'-A overhang to both ends of the PCR product. The
structure of these PCR products favors direct cloning into a
linearized cloning vector with single 3'-dT overhangs. Such an
overhang at the vector cloning site not only facilitates cloning but
also prevents the recircularization of the vector. As a result, more
than 90 % of recombinant clones contain the vector with an insert.
Recombinant clones are selected by blue/white screening.
By comparing the amplified 16S rRNA sequences, it is possible
to estimate the historical branching order of the species and
also the differences in the evolved sequences. The 16S rRNA-based
phylogenetic tree can also be used to relate the three domains of
life: bacteria, archaea, and eukarya. This provides information on
lineages that diverged from a common ancestral lineage. The lengths
of the individual lines in the phylogenetic tree indicate the amount
of sequence change.

Metagenomic DNA constitutes a promising source of novel
functional genes. Function-driven and sequence-driven
analyses are the two major approaches to screening metagenomic
libraries. Function-driven analysis is based on the identification
of clones that express a desired trait, followed by
characterization of the active clones by sequence and biochemical
analysis. Sequence-driven analysis relies on the use of conserved
DNA sequences to design hybridization probes or PCR primers to
screen metagenomic libraries for clones that carry sequences of
interest. Metagenomic libraries with larger DNA inserts, such as
bacterial artificial chromosome (BAC) libraries or cosmid libraries,
are useful for the sequence-based approaches. For the functional
approach, the number of functionally expressed genes should be
as high as possible. Therefore, metagenomic libraries constructed
using plasmid vectors such as pUC19 are widely used for
functional screening. Here, metagenomic DNA library construction
using pUC19 is described. The recombinant clones may
be screened for different genes by function-based or
sequence-based approaches. Library construction with environmental
DNA, heterologous expression in a suitable host, and screening
for the desired phenotype constitute the
basic steps in functional metagenomics. In principle, a large
quantity of pure DNA is used for library construction, as it
allows the analysis of a relatively large subset of an environmental
population. The DNA (~10 µg or more) is fragmented using a
suitable restriction enzyme. The vector DNA (plasmid, cosmid, or
BAC) is also restricted with the same enzyme, followed by a
ligation reaction. Transformation is then performed with the
ligated product and the recombinants are screened for the desired
phenotype. Alternatively, short linkers can be attached to the
restricted fragments (in case of insert DNA <3 kb) and PCR is
performed with primers targeting the linkers. This leads to the
uniform amplification of a complex mixture of larger sized DNA
fragments using small amounts of starting material. This is due to
the ligation of small DNA fragments leading to the formation of
long linear and circular DNA concatemers. As this process leads to
accumulation of large amounts of DNA, restriction digestion and
ligation of this DNA would be efficient for library construction.



14.2 Materials 



14.2.1. Isolation of 
Metagenomic DNA 
from Soil 

14.2.1.1. CTAB Method 



1. Extraction buffer (50 mM Na-phosphate buffer [pH 8], 
50 mM NaCl, 500 mM Tris-HCl [pH 8], 5 % sodium dodecyl 
sulfate) 

2. Phenol-chloroform-isoamyl alcohol (25:24:1) 

3. Sterile glass beads (0.25 mg of 0.1-mm diameter and 0.25 mg
of 0.5-mm diameter)

4. Chloroform-isoamyl alcohol (24:1)

5. CTAB 

6. Isopropanol 

7. 70 % ethanol 

8. TE buffer (10 mM Tris; 1 mM EDTA [pH 8.0]) 



14.2.1.2. TNEP Method

1. TNEP (50 mM Tris, 20 mM EDTA, 100 mM NaCl)

2. 1 % polyvinylpolypyrrolidone (w/v)

3. Lysozyme

4. Achromopeptidase

5. 1 % SDS

6. Isopropanol

7. 70 % ethanol

8. TE buffer (10 mM Tris; 1 mM EDTA [pH 8.0])


14.2.1.3. Indirect Lysis Method

1. Chelex 100

2. 0.1 % sodium deoxycholate

3. 2.5 % polyethylene glycol

4. Sterile gauze bandage

5. Lysis buffer (1 % sarkosyl, 1 % sodium deoxycholate, 1 mg/ml
lysozyme, 10 mM Tris-HCl [pH 8.0], 0.2 M EDTA
[pH 8.0], and 50 mM NaCl)

6. ESP buffer (1 % sarkosyl, 1 mg/ml proteinase K, and 0.5 M
EDTA [pH 8.0])

7. 1 mM phenylmethylsulfonyl fluoride in isopropanol

8. TE buffer (10 mM Tris-HCl with 50 mM EDTA [pH 8.0])

9. Isopropanol


14.2.2. Purification of Metagenomic DNA

14.2.2.1. Using Sephadex G-50 Beads

1. Sephadex G-50 beads

2. 10 mM potassium phosphate buffer (pH 7.2)

3. 10 mM potassium phosphate buffer

4. TE buffer (10 mM Tris-HCl with 50 mM EDTA [pH 8.0])


14.2.2.2. Separation in Agarose Gel

1. Agarose (molecular biology grade)

2. Scalpel

3. Dialysis tubing

4. 1 mM EDTA

5. 2 % NaHCO3

6. 1x TAE buffer

7. TE buffer (10 mM Tris-HCl with 50 mM EDTA [pH 8.0])


14.2.3. Isolation of Metagenomic RNA from Soil

1. 120 mM sodium phosphate buffer (pH 5.2)

2. 1 % diethyl pyrocarbonate

3. Denaturing solution (4 M guanidine thiocyanate, 25 mM
sodium citrate, 0.5 % sarkosyl, 0.1 M 2-mercaptoethanol)

4. 2 M sodium acetate (pH 4.0)

5. Chloroform-isoamyl alcohol (24:1)

6. Isopropanol

7. DEPC-treated sterile water


14.2.4. Amplification of 16S rRNA Sequences from Metagenomic DNA

1. Metagenomic DNA

2. 16S rDNA forward primer (8F: 5'-AGAGTTTGATCMTGGCTCAG-3')

3. 16S rDNA reverse primer (1522R: 5'-AAGGAGGTGATCCANCCRCA-3')

4. dNTPs

5. Taq DNA polymerase

6. PCR reaction buffer


14.2.5. Cloning of 16S rRNA Amplicon

1. PCR amplified and purified product

2. PCR cloning vector (pTZ57R/T, MBI Fermentas)

3. T4 DNA ligase

4. 5x ligation buffer

5. Sterile deionized water

6. Overnight culture of Escherichia coli DH5α

7. CaCl2 (100 mM/85 mM); MgCl2 (100 mM)

8. LB medium

9. Sterile microcentrifuge tubes and tips

10. Sterile glycerol (15 %)

11. LB agar with ampicillin (100 µg/ml), X-gal (20 µg/ml), and
IPTG (40 µg/ml)


14.2.6. Phylogenetic Analysis of 16S rRNA Sequences

1. 16S sequences of the analyte in FASTA format


14.2.7. Construction of Metagenomic Library

1. Plasmid DNA (pUC18)

2. Metagenomic DNA

3. Restriction enzymes

4. Restriction enzyme buffers

5. T4 DNA ligase

6. T4 DNA ligase buffer

7. Electrocompetent E. coli DH10B cells

8. Ampicillin (100 µg/ml)

9. X-gal (20 µg/ml)

10. IPTG (40 µg/ml)

11. LB agar



14.3 Method



14.3.1. Isolation of Metagenomic DNA from Soil by CTAB Method

1. To 0.6 g of soil, add 600 µl of extraction buffer and 300 µl of
phenol-chloroform-isoamyl alcohol.

2. Add 0.5 g of sterile glass beads to the tubes and homogenize the
soil with the extraction buffer in a mini-bead beater for 90 s at
2,500 rpm.

3. Centrifuge the homogenate at 16,000 x g for 2 min.

4. Collect the supernatant, mix with an equal volume of
phenol-chloroform-isoamyl alcohol, and centrifuge the tubes at
6,000 x g for 5 min.

5. Collect the supernatant, mix with an equal volume of
chloroform-isoamyl alcohol, and centrifuge at 16,000 x g
for 5 min.

6. Collect the supernatant, add NaCl to a final concentration
of 1.5 M along with CTAB to 1 %, and incubate the tubes at
65 °C for 30 min.

7. After the incubation, allow the solution to cool down to room
temperature, mix the solution with an equal volume of
chloroform-isoamyl alcohol (24:1), and centrifuge the tubes
at 6,000 x g for 20 min.

8. Collect the supernatant, add 0.6 volume of isopropanol,
and incubate at room temperature for 10 min. Centrifuge at
16,000 x g for 10 min to pellet the precipitated DNA.

9. Discard the supernatant, wash the DNA pellet with 70 %
ethanol, and collect the DNA by centrifugation at 16,000 x g
for 5 min.

10. After air drying to remove traces of ethanol, resuspend the
isolated DNA in 200 µl of TE buffer.


14.3.2. Isolation of Metagenomic DNA from Soil by TNEP Method

1. To 0.5 g of soil, add 0.5 ml of TNEP buffer and 1 %
polyvinylpolypyrrolidone (w/v).

2. Sonicate the samples for 7-10 min at a power setting of 15 W
with 50 % active cycles.

3. Following ultrasonication, add lysozyme and
achromopeptidase to a final concentration of 0.3 mg/ml.

4. Incubate the tubes for 20 min at 37 °C and add SDS to a final
concentration of 1 %.

5. Incubate the suspension for 1 h at 60 °C, vortex the tubes for
10 min, and centrifuge the suspension at 4,000 x g for
5 min.

6. Collect the supernatant and add 0.6 volume of isopropanol to
precipitate the DNA.

7. Wash the DNA with 70 % ethanol as described in the above
procedure and resuspend the DNA in a final volume of
100 µl of TE.


14.3.3. Isolation of Metagenomic DNA from Soil by Indirect Lysis Method

1. To 50 g of soil, add 10 g of Chelex 100, 100 ml of 0.1 % sodium
deoxycholate, and 2.5 % polyethylene glycol 6000 in a 50 ml Oak
Ridge tube.

2. Shake the samples for 1 h at 100 rpm at 4 °C with
occasional rapid agitation by hand.

3. Centrifuge the samples at 960 x g for 15 min to pellet the soil
particles.

4. Collect the supernatant and pass it through a
sterile gauze bandage to remove the Chelex 100.

5. Harvest the bacterial fraction by centrifugation at 22,000 x g
for 20 min.

6. Embed the harvested bacterial cells in low-melting-point
agarose in a 1-ml syringe.

7. Extrude the agarose plug from the syringe, add 10 ml of lysis
buffer and 50 mM NaCl, and incubate the plugs for 1 h at
37 °C.

8. Transfer the plug to 40 ml of ESP buffer and incubate the
plugs for 16 h at 55 °C.

9. After the incubation, inactivate proteinase K by adding 1 mM
phenylmethylsulfonyl fluoride in isopropanol and incubate the
plugs for 1 h at room temperature.

10. Wash the plugs three times with TE buffer for 10 min each
and store the plugs at 4 °C in 10 mM Tris-HCl with 50 mM
EDTA (pH 8.0).

11. Transfer the agarose plugs into a petri dish containing 10 ml
of TE and wash the plugs three times by gentle shaking.

12. After washing, add 5 ml of TE and 6 ml of isopropanol and
incubate the plugs at 50 °C for 30 min.

13. Collect the precipitated DNA by centrifugation at 4,000 x g
for 5 min.

14. Wash the DNA with 70 % ethanol as described in the above
procedure and resuspend the DNA in a final volume of 100 µl
of TE.


14.3.4. Purification of Metagenomic DNA by Using Sephadex G-50 Beads

1. Wash and equilibrate 2 g of Sephadex G-50 beads with 10 ml
of 10 mM potassium phosphate buffer (pH 7.2).

2. Resuspend 500 µl of slurry in 1.5 ml of 10 mM potassium
phosphate buffer.

3. After thorough mixing, transfer 50 µl of the slurry into each of
five microfuge tubes and centrifuge for 1 min at 300 x g to
separate the overlying buffer.

4. Add the metagenomic DNA preparations at different
concentrations (100-500 µg in 1 ml of TE) to the microfuge tubes
containing Sephadex G-50 beads.

5. Mix by inverting the tubes for 15 min and allow the
contents to pass through by incubating the tubes at room
temperature for 5 min.

6. The brown color of the metagenomic DNA preparation can
be seen bound to the matrix.

7. Collect the supernatant containing the DNA by centrifugation
of the tubes at 1,000 x g for 1 min.

8. The metagenomic DNA can then be analyzed for its purity by
spectrophotometric analysis or restriction digestion analysis.


14.3.5. Purification of Metagenomic DNA by Separation in Agarose Gel

1. Resolve the metagenomic DNA in a 0.7 % agarose gel.

2. Excise the DNA fragment from the gel with a clean, sharp
scalpel.

3. Preheat the dialysis tubing at 90 °C for 10 min in 1 mM
EDTA/2 % NaHCO3.

4. Rinse the tubing several times with sterile water prior to use.

5. Equilibrate ~300 mg of gel with 50 ml of fresh 1x TAE buffer
at 4 °C for 30 min.

6. Remove the gel piece, place it lengthwise in a dialysis bag,
and add 400 µl of 1x TAE.

7. Seal the dialysis bag and completely submerge it in an
electrophoresis chamber.

8. Perform electrophoresis at a field strength of ~4-5 V/cm.

9. After 2 h, reverse the polarity of the electrodes for 1 min to
dissociate the DNA bound to the membrane.

10. Remove the dialysis bag from the electrophoresis chamber
and dialyse against TE buffer for 2 h.

11. Collect the dialysed DNA and concentrate it in a vacuum
concentrator.
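
The spectrophotometric purity check mentioned for Sephadex-purified DNA applies equally to electroeluted DNA. The sketch below shows one way the readings could be evaluated. It assumes a 1 cm path length and the widely used conversion of one A260 unit to about 50 ng/µl of double-stranded DNA. The function name and the example readings are hypothetical, and the ratios are general rules of thumb (A260/A280 near 1.8 for clean DNA, and a low A260/A230 pointing to residual humic substances), not values given in this protocol.

```python
def dsdna_purity(a260, a280, a230, dilution_factor=1.0):
    """Return concentration (ng/µl) and purity ratios for a dsDNA sample."""
    if min(a260, a280, a230) <= 0:
        raise ValueError("absorbance readings must be positive")
    return {
        # 1 A260 unit is taken as ~50 ng/µl dsDNA (1 cm path length)
        "ng_per_ul": a260 * 50.0 * dilution_factor,
        # drops when protein or phenol is carried over
        "a260_a280": a260 / a280,
        # drops when humic substances or salts are carried over
        "a260_a230": a260 / a230,
    }


# Hypothetical readings of a 1:10 dilution before and after purification
before = dsdna_purity(a260=0.42, a280=0.30, a230=0.60, dilution_factor=10)
after = dsdna_purity(a260=0.36, a280=0.20, a230=0.18, dilution_factor=10)
for label, result in (("before", before), ("after", after)):
    print(label, {key: round(value, 2) for key, value in result.items()})
```

A rise in both ratios after purification, as in the made-up numbers above, is the pattern one would hope to see when humic material has been removed.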



14.3.6. Isolation of 
Metagenomic RNA 
from Soil 



1. To 10 g of soil, add 20 ml of 120 mM sodium phosphate
buffer (pH 5.2) and 1 % diethyl pyrocarbonate, and shake the
tubes at 150 rpm for 15 min.

2. Centrifuge the tubes at 6,000 x g for 10 min. Discard the
supernatant and wash the pellet once again with phosphate buffer.

3. To the pellet, add 15 ml of denaturing solution and shake the
tubes at 200 rpm for 1 min.

4. To the mixture, add 1.5 ml of 2 M sodium acetate (pH 4.0)
and mix the samples.

5. Extract the total RNA with 15 ml of phenol and 3 ml of
chloroform-isoamyl alcohol.

6. Vigorously shake the mixture to obtain a homogeneous phase
and leave the tubes containing the lysates on ice for 15 min.

7. Following incubation, centrifuge the tubes at 10,000 x g for
20 min at 4 °C and extract the aqueous phase. To this aqueous
phase, add an equal volume of ice-cold isopropanol and store the
tubes at -20 °C for 1 h to precipitate the total RNA.

8. Recover the RNA-containing pellets by centrifugation at
10,000 x g for 20 min at 4 °C followed by vacuum drying.

9. Resuspend the total RNA in 100 µl of DEPC-treated sterile
water.



14.3.7. Amplification of 
16S rRNA Sequences 
from Metagenomic 
DNA 



1. Prepare a master mix with the following composition in the
given order. A typical master mix for a 100 µl PCR is given
below (Table 14.1). Modify the volumes proportionately as
required.

2. Aliquot 18 µl of the master mix into each vial. Add 2 µl of the
genomic DNA from different strains to each tube. Keep one
negative control without template DNA. Perform PCR using
the following cycling conditions.

Step                    Conditions
Initial denaturation    94 °C for 5 min
Denaturation            94 °C for 30 s
Annealing               55 °C for 30 s
Extension               72 °C for 2 min
Final extension         72 °C for 5 min
Holding                 4 °C for 5 min
End

3. Add 2 µl of 6x loading dye to 10 µl of PCR-amplified
product and mix.

4. Load each sample into a well of a 1.0 % agarose gel and run
the gel at 100 V for about 90 min.

5. Load 1 µl of 1 kb ladder as molecular weight marker to
estimate the size of the amplified fragments.

6. Document the image with a gel documentation system.

Table 14.1
Composition of master mix for PCR

Ingredients                     Volume (µl)
10x buffer                      10.0
10 mM dNTPs mixture             2.5
5 µM 8F primer                  10.0
5 µM 1522R primer               10.0
Deionized water                 56.5
Taq DNA polymerase (5 U/µl)     1.0
Total volume                    90.0
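
Scaling the volumes of Table 14.1 "proportionately" is simple arithmetic, sketched below for the 20 µl reactions used in the protocol (18 µl of master mix plus 2 µl of template). The volumes are transcribed from Table 14.1; the function name, the choice of eight reactions, and the 10 % overage are illustrative assumptions only.

```python
# Volumes in µl per 100 µl PCR, transcribed from Table 14.1
MASTER_MIX_100UL = {
    "10x buffer": 10.0,
    "10 mM dNTPs mixture": 2.5,
    "5 µM 8F primer": 10.0,
    "5 µM 1522R primer": 10.0,
    "Deionized water": 56.5,
    "Taq DNA polymerase (5 U/µl)": 1.0,
}


def scale_master_mix(n_reactions, reaction_ul=20.0, overage=0.10):
    """Scale the table to n reactions of reaction_ul each.

    `overage` is an assumed pipetting allowance, not a protocol value.
    """
    factor = n_reactions * (reaction_ul / 100.0) * (1.0 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in MASTER_MIX_100UL.items()}


mix = scale_master_mix(n_reactions=8)
for name, vol in mix.items():
    print(f"{name:<30}{vol:>8.2f} µl")
print(f"{'Total':<30}{sum(mix.values()):>8.2f} µl")
```

The template DNA is deliberately left out of the mix, as in the table, since it is added to each tube separately.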


Table 14.2
Composition of ligation reaction

Components                  Volume (µl)
Vector pTZ57R/T             2.0
5x ligation buffer          6.0
Gel-eluted PCR product      12.5
Deionized water             7.5
T4 DNA ligase (5 U/µl)      2.0
Total volume                30.0


14.3.8. Cloning of 16S rRNA Amplicon

Set up the ligation reaction as given in Table 14.2 and incubate
the ligation mix at 16 °C overnight.



14.3.8.1. Preparation of Competent Cells

1. Inoculate 0.5 ml of an overnight culture of E. coli DH5α
into 50 ml of LB medium (antibiotic free).

2. Grow to early log phase at 37 °C in a shaker until the OD
at 600 nm reaches 0.5-0.6 (2-3 h).

3. Centrifuge the culture in a sterile tube at 3,000 rpm for 5 min.

4. Discard the supernatant, add 9 ml of MgCl2 (100 mM), and
mix gently to suspend the cells completely in the solution.

5. Centrifuge at 3,000 rpm for 5 min at 4 °C.

6. Suspend the cell pellet in 9 ml of ice-cold CaCl2 (100 mM)
and incubate on ice for 40 min.

7. Centrifuge at 3,000 rpm for 5 min. Resuspend the cells in
3.5 ml of FT buffer (85 mM CaCl2 and 15 % glycerol).

8. Make 100 µl aliquots of the cells. Freeze the cells in dry ice/
ethanol or in liquid nitrogen.

9. Store the cells at -80 °C (these cells will retain their
competence for 6-9 months).


14.3.8.2. Transformation

1. Place the aliquots of competent cells on ice and allow the cells
to thaw for 5 min.

2. Add 10 µl of the ligation mix.

3. Mix gently and incubate on ice for 40 min.

4. Heat shock the cells at 42 °C for 90 s and return them to
ice immediately.

5. Add 900 µl of antibiotic-free LB medium to each tube and
incubate at 37 °C for 1 h with shaking.

6. Centrifuge the tubes at 3,000 rpm for 2 min and remove
800 µl of the supernatant.

7. Resuspend the pellet in the remaining 200 µl of medium.

8. Spread 100 µl of resuspended cells onto LB agar plates
containing ampicillin, X-gal, and IPTG.

9. Incubate the plates in an inverted position at 37 °C overnight
and screen the colonies after 10-12 h of incubation.


14.3.9. Phylogenetic Analysis of 16S rRNA Sequences

14.3.9.1. 16S rRNA Sequence


Sequence analysis should be performed on positive recombinants. 
With the sequence obtained, further analysis can be performed 
as follows: 

• Perform a BLAST analysis using the obtained sequence as the 
query at the RDP database (http://rdp.cme.msu.edu/). 

• Select the top ten hits with high query coverage and 
percentage identity, and retrieve their sequences in FASTA 
format. 

• Perform multiple sequence alignment with the ClustalW program 
(http://www.ebi.ac.uk/Tools/msa/clustalw2/). 

• Phylogenetic assessment of the obtained sequence can then be 
made from the query sequence and the retrieved sequences 
using the Tree Builder software 
(http://rdp.cme.msu.edu/treebuilder/treeing.spr). 

• An example of phylogenetic analysis, starting from the 
following query sequence, is outlined below: 

attgaacgctggcggcaggcctaacacatgcaagtcgagcggcagcacgggtactt 

gtacctggtggcgagcggcggacgggtgagtaatgcctaggaatctgcctggtagt 

gggggataacgttcggaaacggacgctaataccgcatacgtcctacgggagaaagc 

aggggaccttcgggccttgcgctatcagatgagcctaggtcggattagctagttggt 

gaggtaatggctcaccaaggcgacgatccgtaactggtctgagaggatgatcagtca 



no rank Root (0/20/109270) (selected/match/total RDP sequences) 
  domain Bacteria (0/20/107507) 
    phylum "Proteobacteria" (0/20/49878) 
      class Gammaproteobacteria (0/20/27579) 
        order Pseudomonadales (0/20/7342) 
          family Pseudomonadaceae (0/20/5683) 
            genus Pseudomonas (0/20/5486) 
              S000000053 not_calculated 1.000 1395 Pseudomonas amygdali (T); LMG 2123T (type strain); Z76654 
              S000381629 not_calculated 0.986 1433 Pseudomonas syringae pv. atropurpurea; MAFF 301017; AB001440 
              S000381633 not_calculated 0.965 1433 Pseudomonas syringae pv. maculicola; MAFF 302264; AB001444 
              S000381634 not_calculated 0.961 1435 Pseudomonas syringae pv. morsprunorum; MAFF 302280; AB001445 
              S000381639 not_calculated 0.961 1434 Pseudomonas syringae pv. theae; PTi; AB001450 
              S000498035 not_calculated 0.966 1413 Pseudomonas syringae pv. tomato str. DC3000; AE016853 
              S000498037 not_calculated 0.966 1413 Pseudomonas syringae pv. tomato str. DC3000; AE016853 
              S000498039 not_calculated 0.966 1413 Pseudomonas syringae pv. tomato str. DC3000; AE016853 
              S000498042 not_calculated 0.966 1413 Pseudomonas syringae pv. tomato str. DC3000; AE016853 
              S000498044 not_calculated 0.966 1413 Pseudomonas syringae pv. tomato str. DC3000; AE016853 
              S000640019 not_calculated 0.964 1345 Pseudomonas sp. Enf4; DQ339631 
              S001611507 not_calculated 0.972 1385 Pseudomonas syringae pv. porri; P55; FN554248 
              S002155707 not_calculated 0.967 1247 Pseudomonas cannabina pv. alisalensis; BS91; GQ470207 
              S002155708 not_calculated 0.967 1247 Pseudomonas cannabina pv. alisalensis; BS413; GQ470208 
              S002155709 not_calculated 0.962 1247 Pseudomonas cannabina pv. alisalensis; BS1034; GQ470209 
              S002155710 not_calculated 0.966 1247 Pseudomonas syringae pv. maculicola; BS286; GQ470210 
              S002155712 not_calculated 0.967 1247 Pseudomonas cannabina pv. alisalensis; BS130; GQ470212 
              S002155713 not_calculated 0.962 1247 Pseudomonas cannabina pv. alisalensis; BS416; GQ470213 
              S002155714 not_calculated 0.967 1247 Pseudomonas syringae pv. tomato; BS287; GQ470214 
              S002155715 not_calculated 0.962 1247 Pseudomonas cannabina pv. alisalensis; BS215; GQ470215 

Fig. 14.1 Computed output of RDP database. 



cactggaactgagacacggtccagactcctacgggaggcagcagtggggaatattg 

gacaatgggcgaaagcctgatccagccatgccgcgtgtgtgaagaaggtcttcgga 

ttgtaaagcactttaagttgggaggaagggcatttacctaatacgtaagtgttttgacg 

ttaccgacagaataagcaccggctaactctgtgccagcagccgcggtaatacagagg 

gtgcaagcgttaatcggaattactgggcgtaaagcgcgcgtaggtggtttgttaagt 

tgaatgtgaaatccccgggctcaacctgggaactgcatccaaaactggcaggctaga 

gtatggtagagggtggtggaatttcctgtctagcggtgaaatgcgtagatataggaa 

ggaacaccagtggcgaaggcgaccacctggactgatactgacactgaggtgcgaaa 

gcgtggggagcaaacaggattagataccctggtagtccacgccgtaaacgatgtcaa 

ctagccgttgggagccttgagctcttagtggcgcagctaacgcattaagttgaccgcc 

tggggagtacggccgcaaggttaaaactcaaatgaattgacgggggcccgcacaag 

cggtggagcatgtggtttaattcgaagcaacgcgaagaaccttaccaggccttgacat 

ccaatgaatcctttagagatagaggagtgccttcgggagcattgagacaggtgctgc 

atggctgtcgtcagctcgtgtcgtgagatgttgggttaagtcccgtaacgagcgcaa 

cccttgtccttagttaccagcacgttaaggtgggcactctaaggagactgccggtgac 

aaaccggaggaaggtggggatgacgtcaagtcatcatggcccttacggcctgggct 

acacacgtgctacaatggtcggtacagagggttgccaaaccgcgaggtggagctaat 

ctcacaaaaccgatcgtagtccggatcgcagtctgcaactcgactgcgtgaagtcgga 

atcgctagtaatcgcgaatcagaatgtcgcggtgaatacgttcccgggccttgtacac 

accgcccgtcacaccatgggagtgggttgcaccagaagtagctagtctaaccttcgg 

cgggacggttaccacggtgtgattcatgactggggtgaagtcgtaacaaggtagcc 

gtaggggaacctgc 

• Submit this sequence to the RDP database. The computed output 
is shown in Fig. 14.1. 

• Select the top ten hits and retrieve the sequences in FASTA 
format. 
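
Selecting the best hits from a long result list is easy to get wrong by eye. The sketch below shows one way to rank hits, assuming the two numeric columns of Fig. 14.1 have been copied into a list by hand; the field names `score` and `count` are hypothetical labels for those columns, and only four rows of the figure are included for brevity.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    seq_id: str
    score: float   # first numeric column of Fig. 14.1
    count: int     # second numeric column of Fig. 14.1
    name: str


def top_hits(hits, n=10):
    """Return the n best hits, highest score first; ties go to the larger count."""
    return sorted(hits, key=lambda h: (h.score, h.count), reverse=True)[:n]


# A few rows transcribed from Fig. 14.1 (not the full list).
HITS = [
    Hit("S000640019", 0.964, 1345, "Pseudomonas sp. Enf4"),
    Hit("S000381629", 0.986, 1433, "Pseudomonas syringae pv. atropurpurea"),
    Hit("S001611507", 0.972, 1385, "Pseudomonas syringae pv. porri"),
    Hit("S000000053", 1.000, 1395, "Pseudomonas amygdali (T)"),
]

if __name__ == "__main__":
    for hit in top_hits(HITS, n=3):
        print(f"{hit.seq_id}  {hit.score:.3f}  {hit.count}  {hit.name}")
```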
[Phylogenetic tree; the branch layout is not recoverable. Taxa shown, from top to bottom:] 
Pseudomonas aeruginosa; #796 
Pseudomonas aeruginosa; #885 
Pseudomonas aeruginosa; H814 
Pseudomonas alcaligenes; LB19 
Pseudomonas pertucinogena (T); IFO 14163T 
Pseudomonas mendocina; ATCC 25413 
Pseudomonas sp. AS-33 
Pseudomonas putida; PI 
Pseudomonas agarici; ATCC25941 
Pseudomonas chlororaphis (T); DSM 50083T (type strain) 
Pseudomonas amygdali (T); LMG 2123T (type strain) 

Fig. 14.2 Phylogenetic tree construction with Tree Builder. 



Table 14.3 
Composition of restriction digestion mixture 

Components                       Volume (µl) 
pUC19 plasmid                    10.0 
10X buffer                        2.0 
BamHI (10 U/µl)                   1.0 
Deionized water                   7.0 
Total volume                     20.0 



• ClustalW multiple sequence alignment of these sequences 
yields results like the alignment shown below. 

• Phylogenetic tree construction from these sequences with the 
Tree Builder software gives the result shown in Fig. 14.2. 
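
The asterisk line printed under each ClustalW block marks columns in which every sequence carries the same residue. As a rough illustration of how such a line, and a simple pairwise identity, can be derived from aligned rows, consider the sketch below. It is not the ClustalW implementation; the function names are hypothetical, and the two test rows are the first 20 columns of two rows of the example alignment.

```python
def consensus_line(aligned):
    """Mark with '*' each column where all rows share the same non-gap residue."""
    if len({len(row) for row in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    marks = []
    for column in zip(*aligned):
        identical = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if identical else " ")
    return "".join(marks)


def percent_identity(a, b):
    """Percent identity over columns where neither row has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    row1 = "GTCGAGCGGCAGCACGGGTA"
    row2 = "GTCGAGCGGTAG-AGAGGTG"
    print(row1)
    print(row2)
    print(consensus_line([row1, row2]))
    print(f"identity: {percent_identity(row1, row2):.1f} %")
```

Identity is computed here over ungapped columns only; other conventions (for example, dividing by the full alignment length) give different values, so the convention should be stated whenever a figure is reported.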



14.3.10. Construction of Metagenomic Library 

14.3.10.1. Restriction Digestion of Vector 

1. Digest 5 µg of pUC19 plasmid with BamHI. Set up the 
restriction digestion reaction on ice. A typical 20 µl reaction 
mix is given in Table 14.3. 

2. Incubate the reaction mixture at 37 °C for 2 h and inactivate 
the reaction by heating at 70 °C for 10 min. 

3. Dephosphorylate the linearized vector (to prevent 
self-ligation) by adding 0.5 U of calf intestine alkaline 
phosphatase and incubating at 37 °C for 1 h. 

4. Purify the restriction-digested and dephosphorylated vector 
by phenol/chloroform extraction or gel purification. 
Table 14.4 
Composition of restriction digestion of metagenomic DNA 

Components                       Volume (µl) 
Metagenomic DNA                  10.0 
10X buffer                        2.0 
BamHI (10 U/µl)                   1.0 
Deionized water                   7.0 
Total volume                     20.0 


Table 14.5 
Composition of ligation mixture 

Components                       Volume (µl) 
Dephosphorylated vector           2.0 
5X Ligation buffer                2.5 
3-8 kb metagenomic fragments     12.5 
Deionized water                  11.0 
T4 DNA ligase                     2.0 
Total volume                     30.0 



14.3.10.2. Restriction Digestion of Metagenomic DNA 

1. The isolated metagenomic DNA should be digested with the 
same restriction enzyme as the vector. 

2. Digest 5 µg of metagenomic DNA with BamHI. Set up the 
restriction digestion reaction on ice. A typical 20 µl reaction 
mix is given in Table 14.4. 

3. Incubate the reaction mixture at 37 °C for 2 h and inactivate 
the reaction by heating at 70 °C for 10 min. 

4. Resolve the digested DNA fragments on a 0.7 % agarose 
gel. Excise the fragments of about 3-8 kb and extract them using 
the QIAquick Gel purification kit. 
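
For a sequence that is already known, the expected outcome of this digestion and size selection can be previewed in silico. The sketch below assumes complete digestion of a linear molecule and uses the BamHI recognition site GGATCC, cut after the first G; the function names and the test sequence are hypothetical and serve only to illustrate the 3-8 kb selection.

```python
BAMHI_SITE = "GGATCC"
CUT_OFFSET = 1  # BamHI cuts between the first G and GATCC (G^GATCC)


def cut_positions(seq, site=BAMHI_SITE, offset=CUT_OFFSET):
    """Return the 0-based cut coordinates for every occurrence of the site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq):
    """Fragment sizes (bp) after complete digestion of a linear sequence."""
    cuts = [0] + cut_positions(seq) + [len(seq)]
    return [end - begin for begin, end in zip(cuts, cuts[1:])]


def size_select(lengths, low=3000, high=8000):
    """Keep fragments inside the excised window (3-8 kb in this protocol)."""
    return [n for n in lengths if low <= n <= high]


if __name__ == "__main__":
    # Artificial test sequence with two BamHI sites.
    demo = "A" * 3500 + "GGATCC" + "T" * 100 + "GGATCC" + "C" * 9000
    sizes = fragment_lengths(demo)
    print("fragments:", sizes)
    print("3-8 kb:", size_select(sizes))
```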


14.3.10.3. Ligation 

1. Set up the ligation reaction as follows (Table 14.5) and 
incubate the ligation mixture at 16 °C overnight. 

2. Inactivate the ligation by heating at 70 °C for 10 min and use 
the ligation mixture for the electrotransformation of E. coli 
DH10B cells. 
ATTGAACGCTGGCGGCAGGCCTAACACATGCAA 33 
ATTGAACGCTGGCGGCAGGCCTAACACATGCAA 33 
TAACACATGCAA 12 
--TTTGATCCTGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAA 49 
--TTTGATCATGG-TCAGATTGAACGCTGGCGGCAGGCATAACACATGCAA 48 
AGTTTGATCCTGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAA 51 
AGTTTGATCCTGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAA 51 
AGTTTGATCCTGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAA 51 
ATTGAACGCTGGCGGCAGGCCTAACACATGCAA 33 
ATTGAACGCTGGCGGCAGGCCTAACACATGCAA 33 
ATTGAACGCTGGCGGCAGGCCTAACACATGCAA 33 



GTCGAGCGGCAGCACGGGTACTTGTACCTGGTGGCG-AGCGGCGGACGGGTGAGTAATGC 92 
GTCGAGCGGTAG-AGAGGTGCTTCCACCTCTTGAG--AGCGGCGGACGGGTGAGTAATGC 90 
GTCGAGCGGATG-AAGGGAGCTTGCTCCCGGATTC--AGCGGCGGACGGGTGAGTAATGC 69 
GTCGAGCGGATG-AGAGGAGCTTGCTCCTTGATTT--AGCGGCGGACGGGTGAGTAATGC 106 
GTCGAGCGGATG-AAGAGAGCTTGCTCTCTGATTC--AGCGGCGGACGGGTGAGTAATGC 105 
GTCGAGCGGATG-AAGGGAGCTTGCTCCTGGATTC--AGCGGCGGACGGGTGAGTAATGC 108 
GTCGAGCGGATG-AAGGGAGCTTGCTCCTGGATTC--AGCGGCGGACGGGTGAGTAATGC 108 
GTCGAGCGGATG-AAGGGAGCTTGCTCCTGGATTC--AGCGGCGGACGGGTGAGTAATGC 108 
GTCGAGCGGATG-AGTGGAGCTTGCTCCATGATTC--AGCGGCGGACGGGTGAGTAATGC 90 
GTCGAGCGGATG-ACGGGAGCTTGCTCCTTGATTC--AGCGGCGGACGGGTGAGTAATGC 90 
GTCGAGCGGAAG-AAGGGAGCTTGCTCCCGGATTC--AGCGGCGGACGGGTGAGTAATGC 90 



********* * * 



* * * * 



* 



*********************** 



CTAGGA-ATCTGCCTGGTAGTGGGGGATAACGTTCGGAAACGGACGCTAATACCGCATAC 151 
CTAGGA-ATCTGCCTGGTAGTGGGGGATAACGTTCGGAAACGGACGCTAATACCGCATAC 149 
CTAGGA-ATCTGCCTGGTAGTGGGGGACAACGTTTCGAAAGGAACGCTAATACCGCATAC 128 
CTAGGA-ATCTGCCTGGTAGTGGGGGATAACGTTCCGAAAGGAACGCTAATACCGCGTAC 165 
CTAGGA-ATCTGCCTGATAGTGGGGGACAACGTTTCGAAAGGAACGCTAATACCGCATAC 164 
CTAGGA-ATCTGCCTGGTAGTGGGGGATAACGTCCGGAAACGGGCGCTAATACCGCATAC 167 
CTAGGA-ATCTGCCTGGTAGTGGGGGATAACGTCCGGAAACGGGCGCTAATACCGCATAC 167 
CTAGGA-ATCTGCCTGGTAGTGGGGGATAACGTCCGGAAACGGGCGCTAATACCGCATAC 167 
CTAGGA-ATCTGCCTGGTAGTGGGGGACAACGTTTCGAAAGGAACGCTAATACCGCATAC 149 
CTAGGA-ATCTGCCTGGTAGTGGGGGACAACGTTTCGAAAGGAACGCTAATACCGCATAC 14 9 
CTGGGA-ATCTGCCTGGTAGTGGGGGATNNNGTCCGGAAACGGGCNNTAATACCGCGTAC 149 



** *** ********* ********** ** **** * * ********* *** 



GTCCTACGGGAGAAAGCAGGGGACCTTCGGGCCTTGCGCTATCAGATGAGCCTAGGTCGG 211 
GTCCTACGGGAGAAAGCAGGGGACCTTCGGGCCTTGCGCTATCAGATGAGCCTAGGTCGG 209 
GTCCTACGGGAGAAAGCAGGGGACCTTCGGGCCTTGCGCTATCAGATGAGCCTAGGTCGG 188 
GTCCTACGGGAGAAAGCAGGGGACCTTCGGGCCTTGCGCTATCAGATGAGCCTAGGTCGG 225 
GTCCTACGGGAGAAAGCAGGGGACCTTCGGGCCTTGCGCTATCAGATGAGCCTAGGTCGG 224 
GTCCTGAGGGAGAAAGTGGGGGATCTTCGGACCTCACGCTATCAGATGAGCCTAGGTCGG 227 
GTCCTGAGGGAGAAAGTGGGGGATCTTCGGACCTCACGCTATCAGATGAGCCTAGGTCGG 227 
GTCCTGAGGGAGAAAGTGGGGGATCTTCGGACCTCACGCTATCAGATGAGCCTAGGTCGG 227 
GTCCTACGGGAGAAAGCAGGGGACCTTCGGGCCTTGCGCTATCAGATGAGCCTAGGTCGG 209 
GTCCTACGGGAGAAAGCAGGGGACCTTCGGGCCTTGCGCTATCAGATGAGCCTAGGTCGG 209 
GTCCTACGGGAGAAAGCAGGGGATCTTCGGACCTTGCGCTATCAGATGAGCCCAGGCCGG 209 



ATTAGCTAGTTGGTGAGGTAATGGCTCACCAAGGCGACGATCCGTAACTGGTCTGAGAGG 271 
ATTAGCTAGTTGGTGAGGTAATGGCTCACCAAGGCGACGATCCGTAACTGGTCTGAGAGG 269 
ATTAGCTAGTTGGTGAGGTAAAGGCTCACCAAGGCGACGATCCGTAACTGGTCTGAGAGG 248 
ATTAGCTAGTTGGTGAGGTAATGGCTCACCAAGGCGACGATCCGTAACTGGTCTGAGAGG 285 
ATTAGCTAGTTGGTGAGGTAATGGCTCACCAAGGCGACGATCCGTAACTGGTCTGAGAGG 2 84 
ATTAGCTAGTTGGTGGGGTAAAGGCCTACCAAGGCGACGATCCGTAACTGGTCTGAGAGG 2 87 
ATTAGCTAGTTGGTGGGGTAAAGGCCTACCAAGGCGACGATCCGTAACTGGTCTGAGAGG 2 87 
ATTAGCTAGTTGGTGGGGTAAAGGCCTACCAAGGCGACGATCCGTAACTGGTCTGAGAGG 2 87 
ATTAGCTAGTTGGTGAGGTAACGGCTCACCAAGGCGACGATCCGTAACTGGTCTGAGAGG 269 
ATTAGCTAGTTGGTGGGGTAATGGCTCACCAAGGCGACGATCCGTAACTGGTCTGAGAGG 269 
ATTAGCTTGTTGGTGAGGTAATGGCTCACCAAGGCGACGATCCGTAACTGGTCTGAGAGG 269 
ATGATCAGTCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGG 331 
ATGATCAGTCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGG 329 
ATGATCAGTCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGG 308 
ATGATCAGTCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGG 345 
ATGATCAGTTACACTGGAACTGAGACACGGTCCAGACTCGTATGGGAGGCAGCAGTGGGG 344 
ATGATCAGTCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGG 347 
ATGATCAGTCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGG 347 
ATGATCAGTCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGG 347 
ATGATCAGTCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGG 329 
ATGATCAGTCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGG 329 
ATGATCAGTCACACTGGAACTGAGACACGGTCCANACTCCTACGGGANGCAGCAGTGGGG 329 
********* ************************ **** ** **** ************ 

AATATTGGACAATGGGCGAAAGCCTGATCCAGCCATGCCGCGTGTGTGAAGAAGGTCTTC 391 
AATATTGGACAATGGGCGAAAGCCTGATCCAGCCATGCCGCGTGTGTGAAGAAGGTCTTC 389 
AATATTGGACAATGGGCGAAAGCCTGATCCAGCCATGCCGCG TGTGTGAAGAAGGTCTTC 368 
AATATTGGACAATGGGCGAAAGCCTGATCCAGCCATGCCGCGTGTGTGAAGAAGGTCTTC 405 
AATATTGGACAATGGGCGAAAGCCTGATCCAGCCATGCCGTGTGTGTGAAGAAGGTCTTC 404 
AATATTGGACAANGGGCGAAANNNTGATCCAGCCATGCCGCGTGTGTGAAGAAGGTCTTC 407 
AATATTGGACAATGGGCGAAANNNTGATCCAGCCATGCCGCGTGTGTGAAGAAGGTCTTC 407 
AATATTGGACAATGGGCGAAAGCCTGATCCAGCCATGCCGCGTGTGTGAAGAAGGTCTTC 407 
AATATTGGACAATGGGCGAAAGCCTGATCCAGCCATGCCGCGTGTGTGAAGAAGGTCTTC 389 
AATATTGGACAATGGGCGAAAGCCTGATCCAGCCATGCCGCGTGTGTGAAGAAGGTCTTC 38 9 
AATATTGGACAATGGGGGAAACCCTGATCCAGCCATGCCGCGTGTGTGAAGAAGGTCTTC 389 
************ *** **** **************** ******************* 

GGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCATTTACCTAATACGTAAGTGTTT -TG 450 
GGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGTACTTACCTAATACGTGAGTATTT -TG 448 
GGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCATTAACCTAATACGTTAGTGTTT -TG 427 
GGATTGTAAAGCACTT TAAGTTGGGAGGAAGGGCAGTAAGTTAATACCTTGCTGTTT -TG 4 64 
GGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCATTAACGTAATACGTTAGTGTTT -TG 4 63 
GGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCAGTAAGTTAATACCTTGCTGTTT -TG 4 66 
GGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCAGTAAGTTAATACCTTGCTGTTT -TG 4 66 
GGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCAGTAAGTTAATACCTTGCTGTTT -TG 4 66 
GGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCAGTAAGTTAATACCTTGCTGTTT -TG 44 8 
GGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCAGTAAGTT AATACCTTGCTGTTT -TG 448 
GGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCTTGCGGCTAATACCTCGCAAGTTTTG 449 
********************************* ****** * ** ** 

ACGTTACCGACAGAATAAGCACCGGCTAACTCTGTGCCAGCAGCCGCGGTAATACAGA GG 510 
ACGTTACCGACAGAATAAGCACCGGCTAACTCTGTGCCAGCAGCCGCGGTAATACAGAGG 508 
ACGTTACCGACAGAATAAGCACCGGCTAACTCTGTNCCAGCAGCCGCGGTAATACAGAGG 4 87 
ACGTTACCGACAGAATAAGCACCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGG 524 
ACGTTACCGACAGAATAAGCACCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGG 523 
ACGTTACCAACAGAATAAGCACCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGG 526 
ACGTTACCAACAGAATAAGCACCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGG 526 
ACGTTACCAACAGAATAAGCACCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGG 52 6 
ACGTTACCGACAGAATAAGCACCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGG 508 
ACGTTACCGACAGAATAAGCACCGGCTAACTCTGTGCCAGCAGCCGCGGTAATACAGAGG 508 
ACGTTACCAACAGAAT AAGCACCGGCTAACTCTGTGCCAGCAGCCGCGGTAATACAGAGG 509 
******** ********************** ** ******************* *** 

GTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTTGTTAAGTTGA 570 
GTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTCGTTAAGTTGG 568 
GTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTGGTTAAGTTGG 547 
GTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTCGTTAAGTTGG 584 
GTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGT GGTTTGTTAAGTTGA 583 
GTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTCAGCAAGTTGG 58 6 
GTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTCAGCAAGTTGG 58 6 
GTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTCAGCAAGTT GG 58 6 
GTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTCAGCAAGTTGG 568 
GTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTTGTTAAGTTGG 568 
GTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCGCGTAGGTGGTTCAGTAAGATGG 569 
ATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCCAAAACTGGCAGGCTAGAGTATGG 630 
ATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCCAAAACTGGCGAGCTAGAGTATGG 628 
ATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCCAAAACTGGCCAGCTAGAGTAGGG 607 
ATGTGAAAGCCCCGGGCTCAACCTGGGAACTGCATCCAAAACTGGCGAGCTAGAGTACGG 644 
ATGTGAAAGCCCCGGGCTCAACCTGGGAACTGCATCCAAAACTGGCAAGCTAGAGTATGG 643 
ATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCCAAAACTACTGAGCTAGAGTACGG 646 
ATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCCAAAACTACTGAGCTAGAGTACGG 64 6 
ATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCCAAAACTACTGAGCTAGAGTACGG 64 6 
ATGTGAAAGCCCCGGGCTCAACCTGGGAACTGCATCCAAAACTACTGAGCTAGAGTACGG 628 
ATGTGAAAGCCCCGGGCTCAACCTGGGAACTGCATCCAAAACTGGCAAGCTAGAGTACGG 628 
AAGTGAAATCCCCGGGCTTAACCTGGGAACTGCTTTCATAACTGCTGGGCTAGAGTACGG 629 



•* •*•*•*•*•*•* ********* ************** * ** **** 



********* 



* * 



TAGAGGGTGGTGGAATTTCCTGTCTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCA 690 
TAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCA 688 
TAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCA 667 
TAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCA 704 
CAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGTGTAGATATAGGAAGGAACACCA 703 
TAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCA 706 
TAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCA 706 
TAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCA 706 
TAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCA 688 
TAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCA 688 
TAGAGGGTAGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCA 689 



******* ************** ************* ********************** 



GTGGCGAAGGCGACC -ACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCA 749 
GTGGCGAAGGCGACC -ACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCA 747 
GTGGCGAAGGCGACC -ACCTGGACTCATACTGACACTGAGGTGCGAAAGCGTGGGGAGCA 726 
GTGGCGAAGGCGACCCACNTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCA 7 64 
GTGGCGAAGGCGACC -ACCTGGGCTAATACTGACACTGAGGTGCGAAAGCGTGGGGAGCA 762 
GTGGCGAAGGCGACC -ACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCA 765 
GTGGCGAAGGCGACC -ACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCA 765 
GTGGCGAAGGCGACC -ACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCA 765 
GTGGCGAAGGCGACC -ACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCA 747 
GTGGCGAAGGCGACC -ACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCA 747 
GTGGCGAAGGCGACT -ACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCA 748 



************** 



** *** ** ********************************** 



AACAGG-ATTA-GATACCCTGGTAGTCCACGCCGTAAACGATGTCAACTAGCCGTTGGGA 807 
AACAGG-ATTA-GATACCCTGGTAGTCCACGCCGTAAACGATGTCAACTAGCCGTTGGGA 805 
AACAGG-ATTA-GATACCCTGGTAGTCCACGCCGTAAACGATGTCAACTAGCCGTTGGGA 784 
AACAGGGNTTAGGATACCCTGGTAGTCCACGCCGTAAACGATGTCAACTAGCCGTTGGAA 824 
AACAGG-ATTA-GATACCCTGGTAGTCCACGCCGTAAACGATGTCGACTAGCCGTTGGGA 820 
AACAGG-ATTA-GATACCCTGGTAGTCCACGCCGTAAACGATGTCGACTAGCCGTTGGGA 823 
AACAGG-ATTA-GATACCCTGGTAGTCCACGCCGTAAACGATGTCGACTAGCCGTTGGGA 823 
AACAGG-ATTA-GATACCCTGGTAGTCCACGCCGTAAACGATGTCGACTAGCCGTTGGGA 823 
AACAGG-ATTA-GATACCCTGGTAGTCCACGCCGTAAACGATGTCGACTAGCCGTTGGGA 805 
AACAGG-ATTA-GATACCCTGGTAGTCCACGCCGTAAACGATGTCAACTAGCCGTTGGAA 805 
AACAGG-ATTA-GATACCCTGGTAGTCCACGCCGTAAACGATGTCAACTAGCCGTTGGAG 806 



****** *** *** 



***************************** ************ 



GCCTTGAGCTCTTAGTGGCGCAGCTAACGCATTAAGTTGACCGCCTGGGGAGTACGGCCG 867 
GCCTTGAGCTCTTAGTGGCGCAGCTAACGCATTAAGTTGACCGCCTGGGGAGTACGGCCG 865 
ACCTTGAGTTCTTAGTGGCGCAGCTAACGCATTAAGTTGACCGCCTGGGGAGTACGGCCG 844 
TCCTTGAGATTTTAGTGGCGCAGCTAACGCATTAAGTTGACCGCCTGGGGAGTACGGCCG 884 
TCCTTGAGATCTTAGTGGCGCAGCTAACGCATTAAGTCGACCGCCTGGGGAGTACGGCCG 880 
TCCTTGAGATCTTAGTGGCGCNNNTAACGCGATAAGTCGACCGCCTGGGGAGTACGGCCG 883 
TCCTTGAGATCTTAGTGGCGCANNNAACGCGATAAGTCGACCGCCTGGGGAGTACGGCCG 883 
TCCTTGAGATCTTAGTGGCGCAGNTAACGCGATAAGTCGACCGCCTGGGGAGTACGGCCG 883 
TCCTTGAGATCTTAGTGGCGAAGCTAACGCGATAAGTCGACCGCCTGGGGAGTACGGCCG 865 
TCCTTGAGATTTTAGTGGCGCASSTAACGCATTAAGTTGACCGCCTGGGGAGTACGGCCG 865 
TCCTTGAGACTTTAGTGGCGCNNNTAACGCACTAAGTTGACCGCCTGGGGAGTACGGTCG 866 
CAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTA 927 
CAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTA 925 
CAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTA 904 
CAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTA 944 
CAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTA 940 
CAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTA 943 
CAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTA 943 
CAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTA 943 
CAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTA 925 
CAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTA 925 
CAAGATTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTA 926 
•*•*•*•* •*•*•*•**•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*****•**********•*•*•*•*•*•* 

ATTCGAAGCAACGCGAAGAACCTTACCAGGCCTTGACATCCAATGAATCCTTTAGAGATA 987 
ATTCGAAGCAACGCGAAGAACCTTACCAGGCCTTGACATCCAATGAACTTTCCAGAGATG 985 
AT T C GAAGC AAC GC GAAGAAC CTTACCAGGCCTTGACATCCAAT GAAT CTTCCAGAGATG 964 
ATTCGAAGCAACGCGAAGAACCTTACCTGGCCTTGACATGCTGAGAACTTTCCAGAGATG 1004 
ATTCGAAGCAACGCGAAGAACCTTACCAGGCCTTGACATGCAGAGAACTTTCCAGAGATG 1000 
ATTCGAAGCAACGCGAAGAACCTTACCTGGCCTTGACATGCTGAGAACTTTCCAGAGATG 1003 
ATTCGAAGCAACGCGAAGAACCTTACCTGGCCTTGACATGCTGAGAACTTTCCAGAGATG 1003 
ATTCGAAGCAACGCGAAGAACCTTACCTGGCCTTGACATGCTGAGAACTTTCCAGAGATG 1003 
ATTCGAAGCAACGCGAAGAACCTTACCTGGCCTTGACATGCTGAGAACTTTCCAGAGATG 985 
ATTCGAAGCAACGCGAAGAACCTTACCAGGCCTTGACATGCAGAGAACTTTCCAGAGATG 985 
ATTCGAAGCAACGCGAAGAACCTTACCTGGCCTTGACATGCAGAGAACCATCCAGAGATG 986 
*************************** *********** * *** * ****** 

GAGGAGTGCCTTCGGGAGCATTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGT 1047 
GATTGGTGCCTTCGGGAACATTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGT 1045 
GAGGAGTGCCTTCGGGAACATTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGT 1024 
GATTGGTGCCTTCGGGAACTCAGACACAGGTGCTGCATGGCTGTCG TCAGCTCGTGTCGT 1064 
GATTGGTGCCTTCGGGAACTCTGACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGT 1060 
GATTGGTGCCTTCGGGAACTCAGACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGT 1063 
GATTGGTGCCTTCGGGAACTCAGACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGT 1063 
GATTGGTGCCTTCGGGAACTCAGACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGT 1063 
GATTGGTGCCTTCGGGAACTCAGACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGT 1045 
GATTGGTGCCTTCGGGAACTCTGACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGT 1045 
GATGGGTGCCTTCGGGAACTCTGACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGT 1046 
** ************ * ** *********************************** 

GAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCCTTAGTTACCAGCACGTTA 1107 
GAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCCTTAGTTACCAGCACGTAA 1105 
GAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCCTTAGTTACCAGCACGTGA 1084 
GAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCCTTAGTTACCAGCACGTTA 1124 
GAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCATTAGTTACCAGCACGTTA 1120 
GAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCCTTAGTTACCAGCACCTCG 1123 
GAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCCTTAGTTACCAGCACCTCG 1123 
GAGATGTTGGGTTAAGTC CCGTAACGAGCGCAACCCTTGTCCTTAGTTACCAGCACCTCG 1123 
GAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCCTTAGTTACCAGCACGTTA 1105 
GAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCCTTAGTTACCAGCACGTTA 1105 
GAGATGTTGGGTTAAGTCCCGTAACGAG CGCAACCCTTGTCCCTAGTTACCAGCACTTCG 1106 
***************************************** ************* * 

AGGTGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAA 1167 
TGGTGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAA 1165 
TGGTGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAA 1144 
TGGTGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAA 1184 
AGGTGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAA 1180 
GG-TGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAA 1182 
GG-TGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAA 1182 
GG-TGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAA 1 182 
TGGTGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAA 1165 
TGGTGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAA 1165 
GG-TGGGCACTCTAGGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAA 1165 
GTCATCATGGCCCTTACGGCCTGGGCTACACACGTGCTACAATGGTCGGTACAGAGGGTT 1227 
GTCATCATGGCCCTTACGGCCTGGGCTACACACGTGCTACAATGGTCGGTACAGAGGGTT 1225 
GTCATCATGGCCCTTACGGCCTGGGCTACACACGTGCTACAATGGTCGGTACAAAGGGTT 1204 
GTCATCATGGCCCTTACGGCCAGGGCTACACACGTGCTACAATGGTCGGTACAAAGGGTT 1244 
GTCATCATGGCCCTTACGGCCTGGGCTACACACGTGCTACAATGGTCGGTACAAAGGGTT 1240 
GTCATCATGGCCCT TACGGCCAGGGCTACACACGTGCTACAATGGTCGGTACAAAGGGTT 1242 
GTCATCATGGCCCTTACGGCCAGGGCTACACACGTGCTACAATGGTCGGTACAAAGGGTT 1242 
GTCATCATGGCCCTTACGGCCAGGGCTACACACGTGCTACAATGGTCGGTACAAAGGGTT 1242 
GTCATCATGGCCCTTACGGCCAGGGCTACACACGTGCTACAATGGTCGGTACAAAGGGTT 1225 
GTCATCATGGCCCTTACGGCCTGGGCTACACACGTGCTACAATGGTCGGTACAGAGGGTT 1225 
GTCATCATGGCCCTTACGGCCAGGGCTACACACGTGCTACAATGGGGGATACAAAGGGTT 1225 



•*•*•*•*•*•*•*•*•**•**•**•***•*•**•* *•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•*•**•* * *•*** *•*•*•*•*•* 
GCCAAACCGCGAGGTGGAGCTAATCTCACAAAACCGATCGTAGTCCGGATCGCAGTCTGC 1287 
GCCAAGCCGCGAGGTGGAGCTAATCCCACAAAACCGATCGTAGTCCGGATCGCAGTCTGC 1285 
GCCAANCCGCGAGGTGGAGCTAATCCCATAAAACCGATCGTAGTCCGGATCGCAGTCTGC 12 64 
GCCAAGCCGCGAGGTGGAGCTAATCCCATAAAACCGATCGTAGTCCGGATCGCAGTCTGC 1304 
GCCAAGCCGCGAGGTGGAGCTAATCCCATAAAACCGATCGTAGTCCGGATCGCAGTCTGC 1300 
GCCAAGCCGCGAGGTGGAGCTAATCCCATAAAACCGATCGTAGTCCGGATCGCAGTCTGC 1302 
GCCAAGCCGCGAGGTGGAGCTAATCCCATAAAACCGATCGTAGTCCGGATCGCAGTCTGC 1302 
GCCAAGCCGCGAGGTGGAGCTAATCCCATAAAACCGATCGTAGTCCGGATCGCAGTCTGC 1302 
GCCAAGCCGCGAGGTGGAGCTAATCCCATAAAACCGATCGTAGTCCGGATCGCAGTCTGC 12 85 
GCCAAGCCGCGAGGTGGAGCTAATCTCACAAAACCGATCGTAGTCCGGATCGCAGTCTGC 12 85 
GCCAAGCCGCGAGGTGGAGCTAATCCCATAAAGTCTCTCGTAGTCCGGATTGGAGTCTGC 1285 



***** ******************* ** *** * 



************* * ******* 
AACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGCGAATCAGAATGTCGCGGTGAATA 1347 
AACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGCGAATCAGAATGTCGCGGTGAATA 1345 
AACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGCGAATCAGAATGTCGCGGTGAATA 1324 
AACTCGACTGCGTGAAGTCGGAATCGCTGGTAATCGTGAATCAGAATGTCACGGTGAATA 1364 
AACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGTGAATCAGAATGTCACGGTGAATA 1360 
AACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGTGAATCAGAATGTCACGGTGAATA 1362 
AACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGTGAATCAGAATGTCACGGTGAATA 1362 
AACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGTGAATCAGAATGTCACGGTGAATA 1362 
AACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGTGAATCAGAATGTCACGGTGAATA 1345 
AACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGCGAATCAGAATGTCGCGGTGAATA 1345 
AACTCGACTCCATGAAGTCGGAATCGCTAGTAATCGTGGATCAGAACGCCACGGTGAATA 1345 



********* * **************** ******* * ******* * * ********* 
CGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCACCAGAAGTA 1407 
CGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCACCAGAAGTA 1405 
CGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCACCAGAAGTA 1384 

CGTTCCC 1371 

CGTTCCCGGGCTTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCTCCAGAAGTA 1420 
CGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCTCCAGAAGTA 1422 
CGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCTCCAGAAGTA 1422 
CGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCTCCAGAAGTA 1422 
CGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCTCCAGAAGTA 1405 
CGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCACCAGAAGTA 1405 
CGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCACCAGAAGTA 1405 



14.3.10.4. Preparation of Electrocompetent DH10B Cells 

1. Pick a colony of DH10B and inoculate it into 5 ml of LB 
broth. Grow overnight at 37 °C with shaking. 

2. Next morning, add 1 % (v/v) of the overnight culture to 500 ml 
of LB medium and incubate at 37 °C with shaking until the OD600 
reaches ~0.7 (this takes ~2 h). 

3. Cool the cells on ice in a cold room for ~20 min. 

4. Centrifuge the rotor without tubes for 5 min to precool it 
to 4 °C. 

5. Pour the cells into two precooled centrifuge bottles, 250 ml each. 
Spin at 5,000 rpm for 15 min at 4 °C. 

6. Decant the supernatant and resuspend the cells in 500 ml of 
sterile ice-cold water. 

7. Spin at 5,000 rpm for 15 min at 4 °C. Decant the supernatant and 
resuspend the cells in 250 ml of sterile ice-cold water. 

8. Centrifuge at 5,000 rpm for 15 min at 4 °C. Decant the 
supernatant and resuspend the cells in 125 ml of sterile ice-cold water. 

9. Centrifuge at 5,000 rpm for 15 min at 4 °C. Decant and 
resuspend the cells in 10 ml of ice-cold 10 % glycerol. Transfer the 
cells into a 50 ml centrifuge tube and centrifuge in a tabletop 
centrifuge at 13,000 rpm for 15 min at 4 °C. Decant the 
supernatant and resuspend the cells in 0.5-1 ml of sterile 10 % 
glycerol. 

10. Aliquot 100 µl of cells into precooled microcentrifuge 
tubes on ice and store the electrocompetent cells at -80 °C 
until use. 

14.3.10.5. Electrotransformation 
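
The centrifugation steps above are given in rpm, which depends on the rotor. To reproduce them on a different centrifuge, rpm can be converted to relative centrifugal force (RCF) with the standard relation RCF = 1.118 × 10^-5 × r × rpm^2, where r is the rotor radius in cm. The sketch below applies that relation; the 10 cm radius is a hypothetical value, since the protocol does not state the rotor used.

```python
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Speed in rpm needed to reach a given RCF on a rotor of this radius."""
    return (rcf / (RCF_CONSTANT * radius_cm)) ** 0.5


if __name__ == "__main__":
    radius = 10.0  # cm, assumed for illustration only
    for speed in (3000, 5000, 13000):
        print(f"{speed:>6} rpm at r = {radius} cm: {rcf_from_rpm(speed, radius):,.0f} x g")
```

With the assumed 10 cm radius, 5,000 rpm corresponds to about 2,795 × g; the value for the rotor actually used should be substituted before comparing runs.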

1. Prior to electroporation, the ligation mixture must be 
precipitated with ethanol or diluted to prevent the sample from 
causing an arc to jump across the cuvette upon application 
of the pulse. 

2. Thaw an aliquot of E. coli DH10B cells on ice. When the cells are 
thawed, add 1-10 µl of ligation mixture to the cells and mix 
by tapping gently. 

3. Carefully pipette the cell/DNA mixture into a chilled 0.1 cm 
cuvette. Gently tap the cuvette to ensure that the cell/DNA 
mixture makes contact all the way across the bottom of the 
cuvette chamber. Avoid the formation of bubbles. 

4. Wipe the outside of the cuvette with a tissue to dry it, place it 
in the electroporation chamber, and apply the pulse. For the Bio-Rad 
Gene Pulser® II electroporator, the recommended pulse 
conditions are 2.0 kV, 200 Ω, and 25 µF. 

5. Immediately after pulsing, add 900 µl of SOC medium and 
transfer the solution to a microcentrifuge tube. Delaying this 
transfer can seriously reduce the survival of transformed cells. 

6. Incubate at 37 °C for 1 h with shaking at 225 rpm. 

7. Spread the cells on LB agar plates containing ampicillin 
(100 µg/ml), X-gal (20 µg/ml), and IPTG (40 µg/ml). 

8. Incubate the plates overnight at 37 °C to permit the color to 
develop sufficiently to distinguish blue colonies from white. 
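
The pulse settings in step 4 translate into two quantities that are often quoted when comparing electroporators: the field strength across the cuvette gap and the nominal time constant of the discharge. The sketch below computes both, reading the printed settings as 2.0 kV, 200 Ω, and 25 µF with a 0.1 cm cuvette. The time constant R × C is only nominal, since the resistance of the sample itself also affects the measured value.

```python
def field_strength_kv_per_cm(voltage_kv, gap_cm):
    """Electric field across the cuvette gap in kV/cm."""
    return voltage_kv / gap_cm


def time_constant_ms(resistance_ohm, capacitance_uf):
    """Nominal RC time constant in milliseconds (ohm x farad = second)."""
    seconds = resistance_ohm * capacitance_uf * 1e-6
    return seconds * 1e3


if __name__ == "__main__":
    field = field_strength_kv_per_cm(2.0, 0.1)
    tau = time_constant_ms(200, 25)
    print(f"field strength: {field:.1f} kV/cm")
    print(f"nominal time constant: {tau:.1f} ms")
```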
Bioinformatics Tools for Interpretation of Data Used 
in Molecular Identification 

Suchi Smita, Krishna P. Singh, Bashir A. Akhoon, Shishir K. Gupta, 
and Shailendra K. Gupta 

Abstract 

Advances in high-throughput instrumentation for genome sequencing have produced a 
massive amount of data, which are publicly available in biological databases. The availability of large 
numbers of completely sequenced prokaryotic and eukaryotic genomes has changed the way biological 
and biomedical experiments are conducted. A variety of tools has been developed to help 
researchers extract the information encoded in these sequences. This chapter 
highlights some basic bioinformatics protocols for using various tools and utilities for molecular data 
interpretation. 



15.1 Introduction 



The amount of sequence data generated through high-throughput 
techniques is significantly outstripping the available storage 
capacity; hence the data need to be stored in formats that 
require less storage space, are easily understandable, 
and are quickly accessible. Biological sequences are encoded in 
specific formats depending on the sequencing methodology involved, 
and different utilities and applications accept different sequence 
formats. Thus, for successful job submission, it is important 
to understand the variety of formats used to describe 
biological sequences and the procedure for converting one sequence 
format into another with sequence format conversion 
utilities. 

A sequence format is the representation in which nucleotide 
and protein sequences are stored on a computer. All sequence 
formats are standard ASCII files that differ in the way they 
hold the sequence data and related information about the sequence, 
such as sequence ID, organism name, sequence title, date of 
submission, function, and comments. Some of the frequently 
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>gi I 3372530 I gb I AF052295 . 1 ! Bradyrhizobium japonicum ferric uptake 
regulator (fur) gene, complete cds 

GGGCCCATCGCGGCCGCGGCCTCTCCCACACGCTGCTGATGAGCCATCTCGGCCACCTCGCCGGGCGCGG 

CGTGCGCACGATATTTCTCGAGGTCGAGGAAAACAACCAGCCGGCACGGCGGCTCTACGCGAGGTGCGGA 

TTCATGGTGGTCGGCCGCCGCGAACGCTACTATA7\ACAGCCGAACGGGGAACAATTGAACGCCCTTCTGA 

TGCGGCGTGACTTGTCGTAACATTGATGGCAGAAAGCGCCCCGTCAGGCAGACAGATCATGACCGCACTG 

AAACCTTCTTCTGCATCCAAGGCGTCCGGCATCGAGGCGCGCTGTGCCGCCACCGGCATGCGCATGACCG 

AGCAGCGCCGCGTCATCGCCCGCGTGCTCGCGGAGGCGGTCGATCATCCCGACGTCGAGGAATTATACCG 

CCGCTGCGTCGCCGTCGACGACAAGATCTCGATCTCAACCGTGTATCGCACCGTCAAGCTGTTCGAGGAT 

GCCGGCATCATCGAACGCCATGACTTTCGCGAGGGACGCGCGCGCTACGAGACGATGCGCGACAGCCATC 

ACGACCACCTCATCAATCTGCGCGACGGCAAGGTGATCGAGTTCACCTCCGAAGAGATCGAGAAGCTCCA 

GGCGGAGATCGCCCGCAAGCTCGGCTACAAGCTGGTCGACCACCGGCTCGAGCTCTATTGCGTCCCGCTC 

GACGACGACAAGCCCACAAGCTAAGTGCCCGTCGATCTCATCATCTTCGACTGCGATGGCGTGCTCGTGG 

ACAGCGAGGTGATCTCCTGTCGCGCGCATGCGGATGTGCTGACCCGCCACGGCTATCCGATC 



Fig. 15.1 Sample FASTA file containing ferric uptake regulator gene from Nostoc azollae. Note the “>” symbol in the 
beginning of entry, followed by title of the sequence. Each line of the sequence contains <80 characters. 



15.1.1. FASTA 
(Pearson) Format 



15.1.2. GenBank 
Format 



encountered sequence formats while handling various bioinfor- 
matics tools and software are: 

This is the most general format to represent DNA, RNA, and 
protein sequences. Sequence is written in standard lUPAC single- 
letter codes for nucleotide and protein sequences. The first line of 
each sequence begins with a greater- than sign (>), followed by 
single line header containing sequence information. Rest of the 
lines contains sequence itself. It is recommended to store <80 
characters per line. Depending on the application, blank lines in 
FASTA files can be ignored or considered as sequence termina- 
tion. Also, the spaces and nonsequence symbols are either ignored 
or treated as gaps as shown in Fig. 15.1. FASTA files may contain 
multiple sequences with one sequence listed right after another. 
The multi-FASTA format is accepted by most of the multiple 
sequence alignment tools. 

GenBank is an annotated collection of all publically available DNA 
sequences managed by National Institute of Health, MD, USA. 
GenBank file provides information related to gene and gene prod- 
uct (e.g., sequence, length, definition, keywords, source organ- 
ism, references, etc.). GenBank format (GenBank Flat File 
Format) consists of an annotation section and a sequence section. 
The start of the annotation section is marked by a line beginning 
with the word “LOCUS.” The start of sequence section is marked 
by a line beginning with the word “ORIGIN” and the end of the 
section is marked by a line with only “//” as shown in Fig. 15.2. 
Some of the commonly used fields in the GenBank records are 
summarized in Table 15.1. 
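
The FASTA conventions of Sect. 15.1.1 can be illustrated with a few lines of Python. The following is only a minimal sketch, not the parser of any particular tool: the function name `read_fasta` and the two toy records are assumptions made for illustration, and the sketch takes the "ignore" option for blank lines and embedded spaces (some applications treat them as terminators or gaps instead).

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # this sketch ignores blank lines
        if line.startswith(">"):
            # a new header closes the previous record, if any
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # join wrapped sequence lines, dropping embedded spaces
            chunks.append(line.replace(" ", ""))
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Two toy records (hypothetical names) in multi-FASTA layout
sample = """>seq1 first toy record
GGGCCCATCG
CGGCCGCGGC

>seq2 second toy record
ATGACCGCAC TG
"""
records = read_fasta(sample)
for header, sequence in records:
    print(header, len(sequence))
```

Because every header line starts a new record, the same loop handles single-sequence and multi-FASTA files alike.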

LOCUS       AF052295                 832 bp    DNA     linear   BCT 20-SEP-1999
DEFINITION  Bradyrhizobium japonicum ferric uptake regulator (fur) gene,
            complete cds.
ACCESSION   AF052295
VERSION     AF052295.1  GI:3372530
KEYWORDS
SOURCE      Bradyrhizobium japonicum
  ORGANISM  Bradyrhizobium japonicum
            Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;
            Bradyrhizobiaceae; Bradyrhizobium.
REFERENCE   1  (bases 1 to 832)
  AUTHORS   Hamza,I., Chauhan,S., Hassett,R. and O'Brian,M.R.
  TITLE     The bacterial irr protein is required for coordination of heme
            biosynthesis with iron availability
  JOURNAL   J. Biol. Chem. 273 (34), 21669-21674 (1998)
   PUBMED   9705301
REFERENCE   2  (bases 1 to 832)
  AUTHORS   Hamza,I., Hassett,R. and O'Brian,M.R.
  TITLE     Identification of a functional fur gene in Bradyrhizobium japonicum
  JOURNAL   J. Bacteriol. 181 (18), 5843-5846 (1999)
   PUBMED   10482529
REFERENCE   3  (bases 1 to 832)
  AUTHORS   O'Brian,M.R.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-MAR-1998) Biochemistry, State University of New York
            at Buffalo, 140 Farber Hall, 3435 Main Street, Buffalo, NY 14214,
            USA
FEATURES             Location/Qualifiers
     source          1..832
                     /organism="Bradyrhizobium japonicum"
                     /mol_type="genomic DNA"
                     /strain="strain I110"
                     /db_xref="taxon:375"
     gene            269..724
                     /gene="fur"
     CDS             269..724
                     /gene="fur"
                     /function="transcriptional regulator"
                     /note="FUR"
                     /codon_start=1
                     /transl_table=11
                     /product="ferric uptake regulator"
                     /protein_id="AAC32180.1"
                     /db_xref="GI:3372531"
                     /translation="MTALKPSSASKASGIEARCAATGMRMTEQRRVIARVLAEAVDHP
                     DVEELYRRCVAVDDKISISTVYRTVKLFEDAGIIERHDFREGRARYETMRDSHHDHLI
                     NLRDGKVIEFTSEEIEKLQAEIARKLGYKLVDHRLELYCVPLDDDKPTS"
ORIGIN
        1 gggcccatcg cggccgcggc ctctcccaca cgctgctgat gagccatctc ggccacctcg
       61 ccgggcgcgg cgtgcgcacg atatttctcg aggtcgagga aaacaaccag ccggcacggc
      121 ggctctacgc gaggtgcgga ttcatggtgg tcggccgccg cgaacgctac tataaacagc
      181 cgaacgggga acaattgaac gcccttctga tgcggcgtga cttgtcgtaa cattgatggc
      241 agaaagcgcc ccgtcaggca gacagatcat gaccgcactg aaaccttctt ctgcatccaa
      301 ggcgtccggc atcgaggcgc gctgtgccgc caccggcatg cgcatgaccg agcagcgccg
      361 cgtcatcgcc cgcgtgctcg cggaggcggt cgatcatccc gacgtcgagg aattataccg
      421 ccgctgcgtc gccgtcgacg acaagatctc gatctcaacc gtgtatcgca ccgtcaagct
      481 gttcgaggat gccggcatca tcgaacgcca tgactttcgc gagggacgcg cgcgctacga
      541 gacgatgcgc gacagccatc acgaccacct catcaatctg cgcgacggca aggtgatcga
      601 gttcacctcc gaagagatcg agaagctcca ggcggagatc gcccgcaagc tcggctacaa
      661 gctggtcgac caccggctcg agctctattg cgtcccgctc gacgacgaca agcccacaag
      721 ctaagtgccc gtcgatctca tcatcttcga ctgcgatggc gtgctcgtgg acagcgaggt
      781 gatctcctgt cgcgcgcatg cggatgtgct gacccgccac ggctatccga tc
//

Fig. 15.2 Sample GenBank file format containing ferric uptake regulator gene from Bradyrhizobium japonicum. Note the data 
available in locus and feature section of the file. 
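
The three landmarks of a GenBank flat file described in Sect. 15.1.2 (LOCUS, ORIGIN, and the "//" terminator) are enough to separate annotation from sequence. The sketch below is an illustration under stated assumptions, not a complete GenBank parser: the function names `split_genbank` and `top_level_fields` are hypothetical, and the sample is a heavily truncated toy version of the record in Fig. 15.2.

```python
def split_genbank(text):
    """Split a GenBank flat file into annotation lines and the sequence."""
    annotation, pieces = [], []
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):        # terminator: end of the entry
            break
        if line.startswith("ORIGIN"):    # start of the sequence section
            in_sequence = True
        elif in_sequence:
            # keep letters only, dropping position numbers and spaces
            pieces.append("".join(ch for ch in line if ch.isalpha()))
        else:
            annotation.append(line)
    return annotation, "".join(pieces)


def top_level_fields(annotation):
    """Map keywords that start in column 1 to the text on their first line."""
    fields = {}
    for line in annotation:
        if line and not line[0].isspace():
            keyword, _, value = line.partition(" ")
            fields.setdefault(keyword, value.strip())
    return fields


# Truncated toy record: only two annotation lines and 70 bases are kept
sample = """\
LOCUS       AF052295                 832 bp    DNA     linear   BCT 20-SEP-1999
ACCESSION   AF052295
ORIGIN
        1 gggcccatcg cggccgcggc ctctcccaca cgctgctgat gagccatctc ggccacctcg
       61 ccgggcgcgg
//
"""
annotation, sequence = split_genbank(sample)
fields = top_level_fields(annotation)
print(fields["ACCESSION"], len(sequence))
```

Indented continuation lines and sub-keywords (ORGANISM, AUTHORS, feature qualifiers) are deliberately skipped here; a real converter would have to handle them.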

Table 15.1 
A summary of fields commonly found in a GenBank record 

Field        Description 
LOCUS        Contains a unique locus name (often the first letter of the genus 
             and species followed by the accession number); sequence length; 
             type of sequence (e.g., genomic DNA, genomic RNA, mRNA, rRNA, 
             tRNA, cytoplasmic RNA); molecular topology (linear/circular); 
             GenBank division; and the date the file was last revised 
DEFINITION   Brief description of the sequence, including the organism name 
             and the gene/protein name 
ACCESSION    Unique sequence identifier 
VERSION      Version of the entry. Allows users to track multiple incarnations 
             of a given sequence 
KEYWORDS     Various keywords describing the sequence 
SOURCE       Organism name followed by the detailed classification from the 
             NCBI taxonomy database 
REFERENCE    Publications by the authors for the GenBank entry 
FEATURES     A concise summary of the gene/protein annotation along with 
             various biologically important regions in the sequence 
ORIGIN       Beginning of the sequence data 



15.1.3. EMBL Format 

In EMBL format, each entry in the database is composed of lines, 
each beginning with a two-character line identifier that indicates the 
type of data recorded on that line. All entries begin with the ID 
(identification) line and end with a terminator line (//), as shown in 
Fig. 15.3. Some of the commonly used line identifiers in the EMBL format and their 
detailed description are listed in Table 15.2. 
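
Because every EMBL line carries its own identifier, an entry can be grouped by line type with very little code. The following is a minimal sketch assuming the identifier occupies the first two characters of each line; the function name `group_embl_lines` and the shortened sample entry are illustrative assumptions, not part of any EMBL toolkit.

```python
def group_embl_lines(text):
    """Group the data of an EMBL entry by two-letter line identifier."""
    groups = {}
    for line in text.splitlines():
        if line.startswith("//"):          # terminator line ends the entry
            break
        code, data = line[:2], line[2:].strip()
        if not code.strip() or code == "XX":
            continue                       # skip sequence rows and spacer lines
        groups.setdefault(code, []).append(data)
    return groups


# Shortened toy entry modeled on accession AF052295
sample = """\
ID   AF052295 standard; linear DNA; BCT; 832 BP.
XX
AC   AF052295;
XX
DE   Bradyrhizobium japonicum ferric uptake regulator (fur) gene,
DE   complete cds.
XX
SQ   Sequence 832 BP;
     gggcccatcg cggccgcggc        20
//
"""
groups = group_embl_lines(sample)
description = " ".join(groups["DE"])
print(description)
```

Lines that share an identifier (here the two DE lines) are collected in order, so multi-line descriptions and titles can be rejoined afterwards.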

15.1.4. MSF File Format 

Multiple sequence format (MSF) files are used by various sequence 
analysis software packages. As with the PHYLIP programs 
for phylogenetic analysis, these packages require multiple sequence 
alignment data as input, here in the MSF format. A sample file is shown 
in Fig. 15.4, where ten Fur proteins are aligned. The file begins 
with either PileUp, !!NA_MULTIPLE_ALIGNMENT, or 
!!AA_MULTIPLE_ALIGNMENT. The second line begins with 
MSF: <length of the alignment> (here 191 amino acids), 
followed by the type of alignment (N: nucleotide, P: protein), a 
checksum number, and two dots that indicate the end of the 
header. The next block lists the sequences, giving for each its 
name, length, a checksum, and weight. 
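
The MSF header fields named above can be pulled out with two regular expressions. This is a sketch only: the patterns and the function name `parse_msf_header` are assumptions written against the layout seen in Fig. 15.4 (including the "oo" token that appears before "Len:" there), and the two sequence names are hypothetical.

```python
import re

# "MSF: <length>  Type: <N|P>  Check: <checksum>  .."
HEADER_RE = re.compile(r"MSF:\s*(\d+)\s+Type:\s*([NP])\s+Check:\s*(\d+)\s+\.\.")
# "Name: <name> oo  Len: <length>  Check: <checksum>  Weight: <weight>"
NAME_RE = re.compile(
    r"Name:\s*(\S+)\s+(?:oo\s+)?Len:\s*(\d+)\s+Check:\s*(\d+)\s+Weight:\s*([\d.]+)"
)


def parse_msf_header(text):
    """Return (header dict, list of per-sequence dicts) for an MSF file."""
    header, entries = None, []
    for line in text.splitlines():
        if line.strip() == "//":          # separator before the alignment
            break
        match = HEADER_RE.search(line)
        if match:
            header = {"length": int(match.group(1)),
                      "type": match.group(2),
                      "check": int(match.group(3))}
            continue
        match = NAME_RE.search(line)
        if match:
            entries.append({"name": match.group(1),
                            "length": int(match.group(2)),
                            "check": int(match.group(3)),
                            "weight": float(match.group(4))})
    return header, entries


sample = """PileUp

   MSF:  191  Type: P    Check:  2949   ..

 Name: seqA oo  Len:  191  Check:  9112  Weight:  9.0
 Name: seqB oo  Len:  191  Check:  6574  Weight:  9.0

//
"""
header, entries = parse_msf_header(sample)
print(header, [e["name"] for e in entries])
```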



15.1.5. PHYLIP File Format 

The Phylogeny Inference Package (PHYLIP) is widely used to 
infer phylogenies, generating evolutionary trees and distance 
matrices for both nucleotide and protein sequences. All the 

ID   AF052295 standard; linear DNA; BCT; 832 BP.
XX
DT   20-SEP-1999
XX
DE   Bradyrhizobium japonicum ferric uptake regulator (fur) gene,
DE   complete cds.
XX
AC   AF052295;
XX
SV   AF052295.1 GI:3372530
XX
KW
XX
OS   Bradyrhizobium japonicum
OC   Bradyrhizobium japonicum
OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;
OC   Bradyrhizobiaceae; Bradyrhizobium.
XX
RN   [1]
RP   1-832
RA   Hamza,I., Chauhan,S., Hassett,R. and OBrian,M.R.
RT   "The bacterial irr protein is required for coordination of heme
RT   biosynthesis with iron availability";
RL   J. Biol. Chem. 273 (34), 21669-21674 (1998)
XX
RN   [2]
RP   1-832
RA   Hamza,I., Hassett,R. and OBrian,M.R.
RT   "Identification of a functional fur gene in Bradyrhizobium japonicum";
RL   J. Bacteriol. 181 (18), 5843-5846 (1999)
XX
RN   [3]
RP   1-832
RA   OBrian,M.R.
RT   "Direct Submission";
RL   Submitted (06-MAR-1998) Biochemistry, State University of New York
RL   at Buffalo, 140 Farber Hall, 3435 Main Street, Buffalo, NY 14214,
RL   USA
XX
FH   Key             Location/Qualifiers
FT   source          1..832
FT                   /organism="Bradyrhizobium japonicum"
FT                   /mol_type="genomic DNA"
FT                   /strain="strain I110"
FT                   /db_xref="taxon:375"
FT   gene            269..724
FT                   /gene="fur"
FT   CDS             269..724
FT                   /gene="fur"
FT                   /function="transcriptional regulator"
FT                   /note="FUR"
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /product="ferric uptake regulator"
FT                   /protein_id="AAC32180.1"
FT                   /db_xref="GI:3372531"
FT                   /translation="MTALKPSSASKASGIEARCAATGMRMTEQRRVIARVLAEAVDHPD
FT                   VEELYRRCVAVDDKISISTVYRTVKLFEDAGIIERHDFREGRARYETMRDSHHDHLINL
FT                   RDGKVIEFTSEEIEKLQAEIARKLGYKLVDHRLELYCVPLDDDKPTS"
SQ   Sequence 832 BP;
     gggcccatcg cggccgcggc ctctcccaca cgctgctgat gagccatctc ggccacctcg        60
     ccgggcgcgg cgtgcgcacg atatttctcg aggtcgagga aaacaaccag ccggcacggc       120
     ggctctacgc gaggtgcgga ttcatggtgg tcggccgccg cgaacgctac tataaacagc       180
     cgaacgggga acaattgaac gcccttctga tgcggcgtga cttgtcgtaa cattgatggc       240
     agaaagcgcc ccgtcaggca gacagatcat gaccgcactg aaaccttctt ctgcatccaa       300
     ggcgtccggc atcgaggcgc gctgtgccgc caccggcatg cgcatgaccg agcagcgccg       360
     cgtcatcgcc cgcgtgctcg cggaggcggt cgatcatccc gacgtcgagg aattataccg       420
     ccgctgcgtc gccgtcgacg acaagatctc gatctcaacc gtgtatcgca ccgtcaagct       480
     gttcgaggat gccggcatca tcgaacgcca tgactttcgc gagggacgcg cgcgctacga       540
     gacgatgcgc gacagccatc acgaccacct catcaatctg cgcgacggca aggtgatcga       600
     gttcacctcc gaagagatcg agaagctcca ggcggagatc gcccgcaagc tcggctacaa       660
     gctggtcgac caccggctcg agctctattg cgtcccgctc gacgacgaca agcccacaag       720
     ctaagtgccc gtcgatctca tcatcttcga ctgcgatggc gtgctcgtgg acagcgaggt       780
     gatctcctgt cgcgcgcatg cggatgtgct gacccgccac ggctatccga tc               832
//

Fig. 15.3 Sample EMBL file format containing ferric uptake regulator gene from Bradyrhizobium japonicum. Note the line identifier 
before each line. 

Table 15.2 
A summary of fields commonly found in an EMBL record 

Line 
identifier   Description 
ID           Identification line. Always the first line of any EMBL entry; it 
             includes the sequence ID, sequence version, topology, type of 
             molecule, data class, division, and sequence length 
XX           No data or comments 
AC           Accession number 
DT           Date of data submission and last modification 
DE           Data description; contains general information about the sequence 
             stored 
SV           Sequence version information along with the GI (GenInfo 
             identifier) number 
KW           Keyword line that provides information to generate cross-reference 
             indexes of the sequence based on structural and functional criteria 
OS           Organism species from which the sequence is derived 
OC           Organism taxonomic classification 
RN           Reference number, a unique number for each reference within the 
             entry 
RP           Reference position, an optional line type, which appears if one or 
             more contiguous base spans of the presented sequence are listed in 
             the reference 
RA           Reference author name 
RL           Reference location line; contains information about the journal, 
             year of publication, volume, and page number 
RT           Reference title 
RX           Reference cross reference 
DR           Database cross reference 
CC           Comments about the entry 
FH           Feature header 
FT           Feature table; contains a summary of structural and functional 
             regions within the sequence and provides cross linkages to other 
             databases 
SQ           Sequence header; contains brief information about the sequence, 
             followed by the sequence data in the 5'-3' direction. Each line of 
             the sequence data is composed of 60 bases grouped into blocks of 
             ten bases. Each block in the line is separated by a space 
//           Terminator of the entry 




programs available in the PHYLIP package require sequences to 
be formatted in PHYLIP’s own format. A sample PHYLIP file 
is shown in Fig. 15.5. The file format is similar to 
the MSF file format. The first line of the file contains two numbers, 
i.e., the number of sequences and the length of the alignment, 

PileUp

   MSF:  191  Type: P    Check:  2949   ..

 Name: gi|295687474|ref|YP_003591167. oo  Len:  191  Check:  9112  Weight:  9.0
 Name: gi|325059641|gb|ADY63332.1|    oo  Len:  191  Check:  6574  Weight:  9.0
 Name: gi|27375908|ref|NP_767437.1|   oo  Len:  191  Check:  9039  Weight:  10.0
 Name: gi|51588753|emb|CAH20364.1|    oo  Len:  191  Check:  6442  Weight:  6.0
 Name: gi|115348324|emb|CAL21256.1|   oo  Len:  191  Check:  6330  Weight:  6.0
 Name: gi|53720551|ref|YP_109537.1|   oo  Len:  191  Check:  1545  Weight:  11.0
 Name: gi|40388402|gb|AAR85472.1|     oo  Len:  191  Check:  4326  Weight:  10.0
 Name: gi|48479086|gb|AAT44865.1|     oo  Len:  191  Check:  1301  Weight:  16.0
 Name: gi|271962470|ref|YP_003336666. oo  Len:  191  Check:  8379  Weight:  18.
 Name: gi|298489843|ref|YP_003720020. oo  Len:  191  Check:  9901  Weight:  15.

//

gi|295687474|ref|YP_003591167.   .......... ...MDRLEKA CIEKGMRMTD QRRVIARVLS SA..EDHPDV
gi|325059641|gb|ADY63332.1|      .........M IDLSKTLEEL CAERGMRMTD QRRVIARVLQ ES..ADHPDV
gi|27375908|ref|NP_767437.1|     ..MTALKPSS ASKASGIEAR CAATGMRMTE QRRVIARVLA EA..VDHPDV
gi|51588753|emb|CAH20364.1|      .......... ...MTDNNKA LKNAGLKVTL PRLKILEVLQ NPA.CHHVSA
gi|115348324|emb|CAL21256.1|     .......... ...MTDNNKA LKNAGLKVTL PRLKILEVLQ NPA.CHHVSA
gi|53720551|ref|YP_109537.1|     .......... ....MTNPTD LKNIGLKATL PRLKILEIFQ QSP.VRHLTA
gi|40388402|gb|AAR85472.1|       .......... MIDERMNSDE LKRAGLKATL PRLKILRIFE DSD.ARHLTA
gi|48479086|gb|AAT44865.1|       ........MS AYTASSLKAE LNARGWRLTP QREKILHVFQ NLPKGNHLSA
gi|271962470|ref|YP_003336666.   .......... .MTETSWHEQ LRARGYRVTP QRQLVLEAVK AA...EHATP
gi|298489843|ref|YP_003720020.   MQKPAISTKA ISSLEDALHR CQMLGMRVSR QRRFILELLW QAN..EHLSA

gi|295687474|ref|YP_003591167.   EELHRRAHAI DPHISIATVY RTVRLFEESG IIERHDFRDG RSRYEETP..
gi|325059641|gb|ADY63332.1|      EELYRRSSAV DPRISISTVY RTVKLFEDAG IIERHDFRDG RSRYETVP..
gi|27375908|ref|NP_767437.1|     EELYRRCVAV DDKISISTVY RTVKLFEDAG IIERHDFREG RARYETMR..
gi|51588753|emb|CAH20364.1|      EDLYKKLIDI GEEIGLATVY RVLNQFDDAG IVTRHNFEGG KSVFELTQ..
gi|115348324|emb|CAL21256.1|     EDLYKILIDI GEEIGLATVY RVLNQFDDAG IVTRHNFEGG KSVFELTQ..
gi|53720551|ref|YP_109537.1|     EDVYRNLLHE ELDIGLATVY RVLTQFEQAG LLSRSNFESG KAVFELNE..

Fig. 15.4 Sample MSF file format containing ten ferric uptake regulator proteins aligned from different microorganisms. 



followed by the sequence data. The sequence name is limited to 
ten characters only. 
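
The PHYLIP layout just described (a count line, ten-character names, then sequence data in blocks) can be read back with a short routine. This is a sketch under assumptions: it handles only the interleaved layout shown in Fig. 15.5, the function name `read_phylip` is hypothetical, and the two-sequence sample is a toy alignment rather than real data.

```python
def read_phylip(text):
    """Read an interleaved PHYLIP alignment into names and sequences."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # first line: number of sequences and alignment length
    n_seqs, length = (int(x) for x in lines[0].split())
    names, seqs = [], []
    for i, line in enumerate(lines[1:]):
        if i < n_seqs:
            # first block: columns 1-10 hold the name, the rest is sequence
            names.append(line[:10].strip())
            seqs.append(line[10:].replace(" ", ""))
        else:
            # later blocks carry sequence only, in the same order
            seqs[i % n_seqs] += line.replace(" ", "")
    return names, seqs, length


sample = """ 2 24
seq_one   MTALKPSSAS KASG
seq_two   --MTDNNKAL KNAG

IEARCAATGM
LKVTLPRLKI
"""
names, seqs, length = read_phylip(sample)
print(names, [len(s) for s in seqs], length)
```

Checking each assembled sequence against the length given on the first line is a cheap way to catch a file that was truncated or mis-converted.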

15.1.6. ALN/ClustalW2 Format 

The ALN/ClustalW2 file format is generated during multiple 
sequence alignment by the ClustalW/ClustalX software. The format 
is widely accepted by software suites that analyze multiple 
sequences in order to investigate structural and functional 
relationships. The ALN/ClustalW2 file begins with the 
version of Clustal with which the file was generated, followed 
by the aligned sequences in blocks of 60 residues or nucleotides 
per line. Information about the alignment at each position is written in 
the last line of each block using three special characters. The character “*” indicates 
that the nucleotides/residues are identical in all the sequences 
in the alignment; “:” means conserved substitutions have been 
observed, while “.” represents semi-conserved substitutions at 
that site. A sample file is shown in Fig. 15.6. 
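
The three conservation symbols (asterisk, colon, and dot, as named in the caption of Fig. 15.6) can be tallied to summarize how conserved an alignment block is. The sketch below is illustrative only: the function name `summarize_conservation` and the six-column consensus string are made up for the example, and a space is taken to mean that no conservation was marked at that column.

```python
# Meaning of each symbol on the consensus line of a Clustal alignment block
SYMBOL_MEANING = {
    "*": "identical",
    ":": "conserved substitution",
    ".": "semi-conserved substitution",
    " ": "not conserved",
}


def summarize_conservation(consensus):
    """Count alignment columns per conservation class."""
    counts = {meaning: 0 for meaning in SYMBOL_MEANING.values()}
    for symbol in consensus:
        # an unexpected symbol raises KeyError rather than being miscounted
        counts[SYMBOL_MEANING[symbol]] += 1
    return counts


consensus = "**:. *"   # hypothetical consensus line covering six columns
counts = summarize_conservation(consensus)
print(counts)
```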

Often, the bioinformatics analysis on a given set of data 
requires integration of third party independent software. Most 

10 191
gi|2956874 ---------- ---MDRLEKA CIEKGMRMTD QRRVIARVLS SA--EDHPDV
gi|3250596 ---------M IDLSKTLEEL CAERGMRMTD QRRVIARVLQ ES--ADHPDV
gi|2737590 --MTALKPSS ASKASGIEAR CAATGMRMTE QRRVIARVLA EA--VDHPDV
gi|5158875 ---------- ---MTDNNKA LKNAGLKVTL PRLKILEVLQ NPA-CHHVSA
gi|1153483 ---------- ---MTDNNKA LKNAGLKVTL PRLKILEVLQ NPA-CHHVSA
gi|5372055 ---------- ----MTNPTD LKNIGLKATL PRLKILEIFQ QSP-VRHLTA
gi|4038840 ---------- MIDERMNSDE LKRAGLKATL PRLKILRIFE DSD-ARHLTA
gi|4847908 --------MS AYTASSLKAE LNARGWRLTP QREKILHVFQ NLPKGNHLSA
gi|2719624 ---------- -MTETSWHEQ LRARGYRVTP QRQLVLEAVK AA---EHATP
gi|2984898 MQKPAISTKA ISSLEDALHR CQMLGMRVSR QRRFILELLW QAN--EHLSA

           EELHRRAHAI DPHISIATVY RTVRLFEESG IIERHDFRDG RSRYEETP--
           EELYRRSSAV DPRISISTVY RTVKLFEDAG IIERHDFRDG RSRYETVP--
           EELYRRCVAV DDKISISTVY RTVKLFEDAG IIERHDFREG RARYETMR--
           EDLYKKLIDI GEEIGLATVY RVLNQFDDAG IVTRHNFEGG KSVFELTQ--
           EDLYKILIDI GEEIGLATVY RVLNQFDDAG IVTRHNFEGG KSVFELTQ--
           EDVYRNLLHE ELDIGLATVY RVLTQFEQAG LLSRSNFESG KAVFELNE--
           GEIYRLLLET GEEVGLATVY RVLTQFEMAG LVRRHHFEGD KAVFELNE--
           EELQELLDKR GEGISLSTIY RSVKLMSRMG ILRELELAEG HKHYELNQPY
           EEICARVRET ARGVNISTVY RTLELLEELG MVTHTHLGHG APTYHLAA--
           REIYDRLNQQ GKEIGHTSVY QNLEALSSQG IIECIEHCDG RLYGNIND--

           THHHDHLIDM KTGKVVEFVD EEIEALQNAI ARKLGYKLVD HRLELYGVPL
           EEHHDHLIDL KNSVVIEFHS PEIEALQEKI AREHGFKLVD HRLELYGVPL
           DSHHDHLINL RDGKVIEFTS EEIEKLQAEI ARKLGYKLVD HRLELYCVPL
           QHHHDHLICL DCGKVIEFSN ESIESLQREI AKQHGIKLTN HSLYLYGHCE
           QHHHDHLICL DCGKVIEFSN ESIESLQREI AKQHGIKLTN HSLYLYGHCE
           GSHHDHLVCL DCGRVEEFFD AEIETRQQSI AKERGFKLQE HSLAMYGTCT
           TGHHDHMVCT ACGKVLEFFD EMLEARQREL AANRGFFISD HSLYLYGTCL
           PHHHHHLVCI QCNKTIEFNN DSILKHSLKQ CEKEGFQLID CQLTVMAICP
           DSDHVHLVCH ECGEINEARP EVVQEFVTKL DEELGFAIDV HHLTVFGRCR
           --AHSHVNCV DTNQILDVHI ELPAELIQQV EAQTGVKIIA YTINFFGHRN

[In the final block (alignment columns 151-191) the gap characters are not legible in the source; only the residues of each sequence are listed, in the same order as above.]
           EE
           KPGEH
           DDDKPTS
           TGNCREDESAHSKR
           TGNCREDESAHSKR
           TENCPYRKH
           GMQDVGICSLRDDDAPGASTD
           EALRMGWPSGIPSNWGCTRSLVDTRFQNCEIPESKEPEPEN
           NCR
           S

Fig. 15.5 Sample PHYLIP file format with aligned ferric uptake regulator proteins from ten different microorganisms. 



of the time, the output generated by one program cannot be used 
directly as the input to another program. The format of the file 
needs to be converted before the second program can read the 
input file. Several utilities available on the Web 
automatically convert one file format into another. Some of 
these utilities are listed in Table 15.3. 
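
To make the idea of format conversion concrete, the sketch below turns aligned sequences in FASTA format into a sequential PHYLIP file. It is not how any of the listed utilities are implemented: the function name `fasta_to_phylip` is hypothetical, the input is assumed to be already aligned (all sequences of equal length), and names are simply cut to ten characters, which can produce duplicate names for similar identifiers.

```python
def fasta_to_phylip(fasta_text):
    """Convert aligned FASTA text to sequential PHYLIP text."""
    records = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # use the first word of the header as the sequence name
            records.append([line[1:].split()[0], ""])
        elif records:
            records[-1][1] += line.replace(" ", "")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("PHYLIP needs aligned sequences of equal length")
    # header line: number of sequences and alignment length
    out = [f"{len(records)} {lengths.pop()}"]
    for name, seq in records:
        # names are limited to ten characters and padded to that width
        out.append(f"{name[:10]:<10}{seq}")
    return "\n".join(out) + "\n"


aligned = """>gi|27375908|ref|NP_767437.1| fur
--MTALKPSS
>short
MQKPAISTKA
"""
phylip_text = fasta_to_phylip(aligned)
print(phylip_text)
```

The truncation step mirrors the ten-character names visible in Fig. 15.5; a production converter would also need to guarantee that the shortened names stay unique.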



15.2 Materials 

15.2.1. Converting Sequence Format Using SeqVerter 

1. Hardware: Computer system with at least a Pentium 450 MHz 
processor, 50 MB hard disk space, and 512 MB RAM. 

2. Software: Windows 7, Vista, XP, 2000, Linux, or Mac OS. SeqVerter 
sequence format conversion utility 

CLUSTAL X (1.81) multiple sequence alignment


gi|295687474|ref|YP_003591167.      -------------MDRLEKACIEKGMRMTDQRRVIARVLSSA--EDHPDV
gi|325059641|gb|ADY63332.1|         ---------MIDLSKTLEELCAERGMRMTDQRRVIARVLQES--ADHPDV
gi|27375908|ref|NP_767437.1|        --MTALKPSSASKASGIEARCAATGMRMTEQRRVIARVLAEA--VDHPDV
gi|51588753|emb|CAH20364.1|         -------------MTDNNKALKNAGLKVTLPRLKILEVLQNPA-CHHVSA
gi|115348324|emb|CAL21256.1|        -------------MTDNNKALKNAGLKVTLPRLKILEVLQNPA-CHHVSA
gi|53720551|ref|YP_109537.1|        --------------MTNPTDLKNIGLKATLPRLKILEIFQQSP-VRHLTA
gi|40388402|gb|AAR85472.1|          ----------MIDERMNSDELKRAGLKATLPRLKILRIFEDSD-ARHLTA
gi|48479086|gb|AAT44865.1|          --------MSAYTASSLKAELNARGWRLTPQREKILHVFQNLPKGNHLSA
gi|271962470|ref|YP_003336666.      -----------MTETSWHEQLRARGYRVTPQRQLVLEAVKAA---EHATP
gi|298489843|ref|YP_003720020.      MQKPAISTKAISSLEDALHRCQMLGMRVSRQRRFILELLWQAN--EHLSA

gi|295687474|ref|YP_003591167.      EELHRRAHAIDPHISIATVYRTVRLFEESGIIERHDFRDGRSRYEETP--
gi|325059641|gb|ADY63332.1|         EELYRRSSAVDPRISISTVYRTVKLFEDAGIIERHDFRDGRSRYETVP--
gi|27375908|ref|NP_767437.1|        EELYRRCVAVDDKISISTVYRTVKLFEDAGIIERHDFREGRARYETMR--
gi|51588753|emb|CAH20364.1|         EDLYKKLIDIGEEIGLATVYRVLNQFDDAGIVTRHNFEGGKSVFELTQ--
gi|115348324|emb|CAL21256.1|        EDLYKILIDIGEEIGLATVYRVLNQFDDAGIVTRHNFEGGKSVFELTQ--
gi|53720551|ref|YP_109537.1|        EDVYRNLLHEELDIGLATVYRVLTQFEQAGLLSRSNFESGKAVFELNE--
gi|40388402|gb|AAR85472.1|          GEIYRLLLETGEEVGLATVYRVLTQFEMAGLVRRHHFEGDKAVFELNE--
gi|48479086|gb|AAT44865.1|          EELQELLDKRGEGISLSTIYRSVKLMSRMGILRELELAEGHKHYELNQPY
gi|271962470|ref|YP_003336666.      EEICARVRETARGVNISTVYRTLELLEELGMVTHTHLGHGAPTYHLAA--
gi|298489843|ref|YP_003720020.      REIYDRLNQQGKEIGHTSVYQNLEALSSQGIIECIEHCDGRLYGNIND--

gi|295687474|ref|YP_003591167.      THHHDHLIDMKTGKVVEFVDEEIEALQNAIARKLGYKLVDHRLELYGVPL
gi|325059641|gb|ADY63332.1|         EEHHDHLIDLKNSVVIEFHSPEIEALQEKIAREHGFKLVDHRLELYGVPL
gi|27375908|ref|NP_767437.1|        DSHHDHLINLRDGKVIEFTSEEIEKLQAEIARKLGYKLVDHRLELYCVPL
gi|51588753|emb|CAH20364.1|         QHHHDHLICLDCGKVIEFSNESIESLQREIAKQHGIKLTNHSLYLYGHCE
gi|115348324|emb|CAL21256.1|        QHHHDHLICLDCGKVIEFSNESIESLQREIAKQHGIKLTNHSLYLYGHCE
gi|53720551|ref|YP_109537.1|        GSHHDHLVCLDCGRVEEFFDAEIETRQQSIAKERGFKLQEHSLAMYGTCT
gi|40388402|gb|AAR85472.1|          TGHHDHMVCTACGKVLEFFDEMLEARQRELAANRGFFISDHSLYLYGTCL
gi|48479086|gb|AAT44865.1|          PHHHHHLVCIQCNKTIEFNNDSILKHSLKQCEKEGFQLIDCQLTVMAICP
gi|271962470|ref|YP_003336666.      DSDHVHLVCHECGEINEARPEVVQEFVTKLDEELGFAIDVHHLTVFGRCR
gi|298489843|ref|YP_003720020.      --AHSHVNCVDTNQILDVHIELPAELIQQVEAQTGVKIIAYTINFFGHRN

[In the final block (alignment columns 151-191) the gap characters are not legible in the source, and neither are the consensus lines of *, :, and . symbols under each block; only the residues of the final block are listed.]
gi|295687474|ref|YP_003591167.      EE
gi|325059641|gb|ADY63332.1|         KPGEH
gi|27375908|ref|NP_767437.1|        DDDKPTS
gi|51588753|emb|CAH20364.1|         TGNCREDESAHSKR
gi|115348324|emb|CAL21256.1|        TGNCREDESAHSKR
gi|53720551|ref|YP_109537.1|        TENCPYRKH
gi|40388402|gb|AAR85472.1|          GMQDVGICSLRDDDAPGASTD
gi|48479086|gb|AAT44865.1|          EALRMGWPSGIPSNWGCTRSLVDTRFQNCEIPESKEPEPEN
gi|271962470|ref|YP_003336666.      NCR
gi|298489843|ref|YP_003720020.      S

Fig. 15.6 Sample ALN/ClustalW file format with aligned ferric uptake regulator proteins from ten different microorganisms, 
generated using the Clustal X version 1.81 program. Note the last line in the alignment file with three special 
characters, i.e., asterisk, colon, dot along with the conservation observed at that site. 



[Software can be downloaded from: http://www.genestudio.com/download_seq.htm]. 

3. Files: Download the Bradyrhizobium japonicum ferric uptake 
regulator (fur) gene (Accession number: AF052295) from the 
GenBank database available at http://www.ncbi.nlm.nih.gov 
in GenBank format. 

15.2.2. Raw Sequence Editing Tool 

FinchTV (http://www.geospiza.com/finchtv.html) editing tool. 

Table 15.3 
Some commonly used sequence file format converting tools 

Tool: ReadSeq 
Description: The multi-format molbio sequence reader is a Web-based utility that automatically identifies the format of single or multiple input nucleotide/protein sequences and converts them into a user-defined format 
Web address: http://searchlauncher.bcm.tmc.edu/seq-util/Options/readseq.html 

Tool: Sequence Format Converter 
Description: A simple tool that converts between common sequence formats. The tool is based on the SeqIO module of BioPerl 
Web address: http://www.bioinformaticsbox.com/tools/sequence_format_converter.php 

Tool: Squizz 
Description: The Web resource provides two separate utilities: (1) a sequence format converter tool and (2) an alignment format converter tool 
Web address: http://mobyle.pasteur.fr/cgi-bin/portal.py?#forms::squizz_convert 

Tool: SeqVerter 
Description: Standalone sequence file format conversion utility by GeneStudio, Inc. The software also provides various sequence editing functions 
Web address: http://www.genestudio.com/seqverter.htm 



15.2.3. Sequence Analysis Tools for DNA and Proteins 

1. Using Web-based BLAST for nucleotide sequences 

The Basic Local Alignment Search Tool (BLAST), available from 
the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov), 
is the key software for identifying similar 
sequences and the domain architecture of any given nucleotide/protein 
sequence [1]. 

2. Hardware: Computer system with internet connection. 

3. Software: Web-browser (Internet Explorer, Mozilla Firefox, 
Safari, or Opera). 

4. Files: In the present protocol we will continue with our previ- 
ous example of ferric uptake regulator (fur) gene (Accession 
number: AF052295) from Bradyrhizobium japonicum. 



15.2.4. Nucleotide and Protein Sequence Analysis Using SDSC Biology WorkBench 

The Biology WorkBench is a Web-based utility for biologists to 
search and analyze nucleotide and protein sequence databases [2]. 
The database search is integrated with a wide variety of analysis 
and modeling tools. The Biology WorkBench was originally devel- 
oped by the Computational Biology Group at the National Center 
for Supercomputing Applications at the University of Illinois at 
Urbana-Champaign, and the ongoing development of version 3.2 
is occurring at the San Diego Supercomputer Center, at the 
University of California, San Diego. To use Biology WorkBench, 
users need to register at http://workbench.sdsc.edu. 

Fig. 15.7 SeqVerter interface. Sequences can be uploaded either by drag and drop or by selecting Import sequences. 



1. Hardware: Computer system with internet connection. 

2. Software: Web-browser (Internet Explorer, Mozilla Firefox, 
Safari, or Opera). 

3. Files: In the present protocol we will continue with our previ- 
ous examples of ferric uptake regulator (fur) genes and pro- 
tein sequences from various microorganisms. 



15.3 Methods 

15.3.1. Converting Sequence Format Using SeqVerter 

1. Open the SeqVerter utility installed in the system (Fig. 15.7). 

2. Load the fur gene into the software either by dragging and dropping 
it onto the software interface or by selecting the Import button. 

3. Select the check box in front of the fur gene as shown in 
Fig. 15.8. As the file contains only one sequence, click on the Split 
sequences (Single sequence files) button. 

4. Select the desired location and the file format from the drop-down 
menu of the Export single sequence files widget (Fig. 15.9). 

Fig. 15.8 fur gene AF052295 is uploaded in the software. Note the status of the checkbox in front of the sequence file name. 

Fig. 15.9 Export single sequence files widget. Note the Export folder section and File format drop-down list. 



Click on the OK button to convert and export the sequence 
in the selected format to the user-defined location. 

5. The SeqVerter utility can also be used for multiple sequences. 
Figure 15.10 shows ferric uptake regulator proteins (Fur proteins) 
from ten different microorganisms. 

6. Various multiple alignment formats are available in the software. 
The format can be selected from the file format drop-down 
menu as shown in Fig. 15.11. 

Fig. 15.10 Ferric uptake regulator proteins from ten different microorganisms are uploaded in the SeqVerter software. 
Note that all check boxes in front of the accession numbers are selected. Also note the Merge sequences (Multiple sequence 
file) button. 

Fig. 15.11 Various multiple alignment formats in the Export multiple sequence file widget of the software. 

15.3.2. Raw Sequence Editing Tool 

Biological sequences are delivered as standard chromatogram data 
and text files. Several software utilities are available to read and 
manipulate these sequence files for further bioinformatics data 
processing. Some of the commonly used tools are listed in 
Table 15.4. 

One of the raw sequence editing tools, FinchTV (http://www.geospiza.com/finchtv.html), 
is shown in Fig. 15.12. The 
tool can read DNA sequence chromatogram files. Trace peaks can 
be scaled vertically and horizontally using the Vertical and Horizontal 
scale sliders present on the left and bottom, respectively. Bases can 

Table 15.4 
Commonly used raw sequence editing tools 

Tool: FinchTV 
Description: Geospiza’s FinchTV is a popular way to view DNA sequence traces. It can display an entire trace in a scalable multi-pane view. It provides utilities to perform a BLAST search and to reverse complement the sequence and trace 
Web address: http://www.geospiza.com/Products/finchtv.shtml 

Tool: SeqPup 
Description: Biological sequence editor and analysis program. It includes links to network services and various external analysis programs such as clustal, cap, fastdnaml, and tacg 
Web address: http://iubio.bio.indiana.edu/soft/molbio/seqpup 

Tool: Sequin 
Description: A standalone utility for submitting and updating sequence entries to the GenBank, EMBL, or DDBJ sequence databases 
Web address: http://www.ncbi.nlm.nih.gov/Sequin/index.html 

Tool: BioEdit 
Description: BioEdit is a sequence alignment editor with several sequence manipulation and analysis options and links to external analysis programs 
Web address: http://www.mbio.ncsu.edu/BioEdit/bioedit.html 

Tool: JaMBW 
Description: The Java-based Molecular Biologist WorkBench program provides utilities for sequence format conversion, sequence manipulation, and various sequence analysis options, such as composition, feature viewer, ORF, isoelectric point, antigenic index, and oligo calculator 
Web address: http://www.bioinformatics.org/JaMBW/ 

Tool: Sequencher 4.10.1 
Description: Industry standard sequence analysis software for all automated sequencers, including next generation sequencers. Some of the applications include de novo gene sequencing, mutation detection, systematics, etc. 
Web address: http://www.genecodes.com/ 

Tool: CodonCode Aligner 
Description: Program for sequence assembly, contig editing, feature detection, and mutation detection. The software is available for Windows as well as Mac systems 
Web address: http://www.codoncode.com/aligner/index.htm 

Tool: 4Peaks 
Description: Sequence trace file viewing and editing system for Mac OS. The software can read multiple file formats, automatically translate sequences, and provides an interface to add plugins to enhance functionality 
Web address: http://www.mekentosj.com/science/4peaks/ 

Tool: PriorsEditor 1.0.10 
Description: A general editor for regulatory region analysis and transcription factor binding site discovery. The de novo motif discovery program PRIORITY is bundled with the software 
Web address: http://tare.medisin.ntnu.no/priorseditor/index.php 




be inserted at and deleted from the desired position. Sequence
features can be identified by performing the online BLAST search
available within the software.

15.3.3. Sequence Analysis Tools for DNA and Proteins

1. Open the Web browser with the following address: http://blast.ncbi.nlm.nih.gov.
There are two major sections for the BLAST tool:
(a) Basic BLAST and (b) Specialized BLAST, as
Fig. 15.12 FinchTV DNA sequence trace file reader. A sample trace file is loaded in the software. The trace file can be
edited by selecting part of the sequence and using the options available in the Edit menu.
Fig. 15.13 Home page of the NCBI BLAST server (http://blast.ncbi.nlm.nih.gov). Note the five Basic BLAST programs available
for nucleotide and protein sequences.

seen in Fig. 15.13. Select nucleotide BLAST under Basic 
BLAST tool section. 

2. Paste the accession number AF052295 in the input box
provided (a FASTA sequence or gi number can also be submitted
for a BLAST search). The BLAST tool will automatically assign a job
title from the description available for the accession number.
Fig. 15.14 Web interface for the nucleotide BLAST search tool. The accession number for the fur gene from Bradyrhizobium
japonicum is provided as input. The Nucleotide collection (nr/nt) database is selected for searching.



The user can also provide a job title to distinguish between the
various jobs being run (Fig. 15.14).

3. A query subrange can be defined if the user wants to perform
the BLAST search for only part of the sequence.

4. Databases are divided into three broad sections. Select the “Other
(nr etc.)” radio button and then select Nucleotide collection
(nr/nt) from the drop-down menu. The list of databases is
provided in Table 15.5.

5. With all the default options selected, click on the BLAST
button. To increase the performance, sensitivity, and selectivity
of the BLAST search, various options are available in the
advanced BLAST search.
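
The web-form steps above also have a scripted counterpart. The following is a minimal sketch, assuming the NCBI BLAST URL API accepts a `CMD=Put` request with `PROGRAM`, `DATABASE`, `QUERY`, `QUERY_FROM`, `QUERY_TO`, and `EXPECT` parameters, and that `nt` is the database name corresponding to Nucleotide collection (nr/nt). The helper name `build_blast_put_url` is hypothetical. The function only assembles the request URL and does not contact the server.

```python
from urllib.parse import urlencode

# Assumed endpoint of the NCBI BLAST URL API.
BLAST_ENDPOINT = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blast_put_url(query, program="blastn", database="nt",
                        query_from=None, query_to=None, expect=None):
    """Build (but do not send) a BLAST submission URL.

    query      -- accession number, gi number, or raw sequence (step 2)
    query_from -- optional 1-based start of the query subrange (step 3)
    query_to   -- optional 1-based end of the query subrange (step 3)
    database   -- database name (step 4)
    expect     -- optional Expect threshold (an advanced option, step 5)
    """
    params = {
        "CMD": "Put",
        "PROGRAM": program,
        "DATABASE": database,
        "QUERY": query,
    }
    # Only restrict the search to a subrange when both ends are given.
    if query_from is not None and query_to is not None:
        params["QUERY_FROM"] = query_from
        params["QUERY_TO"] = query_to
    if expect is not None:
        params["EXPECT"] = expect
    return BLAST_ENDPOINT + "?" + urlencode(params)


# Accession number used in the protocol, restricted to bases 1-300.
url = build_blast_put_url("AF052295", query_from=1, query_to=300)
print(url)
```

Submitting such a URL would normally return a request identifier that has to be polled for results; that part is deliberately left out of the sketch.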

15.3.3.1. BLAST Result

The basic nucleotide BLAST (blastn) output is shown in
Fig. 15.15. A total of 24 BLAST hits were returned by the server for
the query sequence. A graphical summary of the BLAST result
is provided, based on the alignment score, to quickly identify the
best-matched sequences and their query coverage. Taxonomy reports and a
distance tree of results can be generated to find similar sequences
in diverse organisms.
Table 15.5
List of databases available for online nucleotide BLAST search

Human genomic plus transcript (Human G + T) 

Mouse genomic plus transcript (Mouse G + T) 

Nucleotide collection (nr/nt) 

Reference mRNA sequences (refseq_rna) 

Reference genomic sequences (refseq_genomic) 

NCBI Genomes (chromosome) 

Expressed sequence tags (est) 

Nonhuman, non-mouse ESTs (est_others) 

Genomic survey sequences (gss) 

High-throughput genomic sequences (HTGS) 

Patent sequences (pat) 

Protein Data Bank (pdb) 

Human ALU repeat elements (alu_repeats) 

Sequence tagged sites (dbsts) 

Whole-genome shotgun reads (wgs) 

Environmental samples (env_nt) 



15.3.4. Nucleotide and Protein Sequence Analysis Using SDSC Biology WorkBench



1. Point the Web browser to http://workbench.sdsc.edu. If you are
already registered on the server, click on the link to enter
Biology WorkBench; otherwise, click on the register link, as shown
in Fig. 15.16.

2. The basic structure of Biology WorkBench can be divided into
three major sections: (1) Tool Sets, (2) Sequence, and
(3) Tools, as shown in Fig. 15.17.

Tool sets currently include: 

(a) Session Tools 

(b) Protein Tools 

(c) Nucleic Tools 

(d) Alignment Tools 

(e) Structure Tools (Alpha) 

(f) Report Bugs 

The WorkBench views sequences as objects on which it can
perform tasks. These sequences can either be imported
from your local machine or retrieved by searching different
biological databases. The Protein Tools take protein
sequences as inputs. The Nucleic Tools take nucleotide
[Screenshot: NCBI BLASTN 2.2.26+ result page for query AF052295, Bradyrhizobium japonicum ferric uptake regulator (fur) gene, complete cds (nucleic acid, query length 832), searched against the nr database. The Graphic Summary shows the distribution of 24 BLAST hits on the query sequence with the color key for alignment scores; the Descriptions panel lists the sequences producing significant alignments, including Bradyrhizobium japonicum, Bradyrhizobium sp., and Rhodopseudomonas palustris entries.]

Fig. 15.15 Blastn output. (a) Distribution of hits and color-coded graphical result. (b) List of related sequences with accession
number, description, score, E-value, and database linkages. (c) The detailed pairwise alignment between the query
sequence and the database sequence.
[Screenshot: SDSC Biology WorkBench home page, with the registration link for first-time users, the link to enter Biology WorkBench 3.2, and an announcement of the Next Generation Biology Workbench (NGBW).]

Fig. 15.16 Home page of SDSC Biology WorkBench at http://workbench.sdsc.edu.



sequences as inputs, and the Alignment Tools take aligned
sequences as inputs. Alignment between nucleotide/protein
sequences can be performed using tools in either the Protein
or Nucleic Tools set. The output of these programs is
automatically passed to the Alignment Tools, where
you can perform operations on aligned sequences.

3. Click on the Session Tools button to start the default session of 
the Biology WorkBench. You can rename and save the session 
so that you can work on the same sets of sequences in future. 

4. Click on the Nucleic Tools button. The programs associated
with nucleotide sequence analysis will be listed in the
dropdown menu, as shown in Fig. 15.18. Select Add
New Nucleotide Sequence and click on the Run button.

5. A sequence can be uploaded by selecting the Browse button
and then the Upload File button (Fig. 15.19). A sequence can also
be entered manually on the server by providing the Label and
Sequence data.

6. Click on the Save button to save the sequences in the current
session for further analysis. Various sequence analysis tools are
available on WorkBench. An analysis can be performed by selecting
the checkbox in front of each sequence, selecting the
appropriate tool from the list, and then clicking the Run button
(Fig. 15.20). The list of tools available for nucleotide
sequences is provided in Table 15.6.
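
Step 5 asks for a Label and a Sequence for each entry. When sequences are held in a FASTA-formatted text file (assumed here to be one of the accepted input formats), those two fields can be extracted with a few lines of code before pasting or uploading. The sketch below is illustrative only: `parse_fasta` is a hypothetical helper, and the two records are made-up toy sequences.

```python
def parse_fasta(text):
    """Split FASTA-formatted text into a list of (label, sequence) pairs."""
    records = []
    label, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if label is not None:
                records.append((label, "".join(chunks)))
            label, chunks = line[1:].strip(), []
        elif label is not None:
            # Sequence lines are joined and upper-cased.
            chunks.append(line.upper())
    if label is not None:
        records.append((label, "".join(chunks)))
    return records


# Toy input (hypothetical sequences, not taken from the chapter).
example = """>toy_seq_1 hypothetical example
ACGTAC
GTTA
>toy_seq_2
ggcc
"""

records = parse_fasta(example)
for label, sequence in records:
    print(label, len(sequence), sequence)
```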
[Screenshot: Biology WorkBench 3.2 start page at http://workbench.sdsc.edu, showing the Session Tools, Protein Tools, Nucleic Tools, Alignment Tools, and Structure Tools (Alpha) buttons.]

Fig. 15.17 SDSC Biology WorkBench. Note the five buttons to initiate the work.



15.3.4.1. Calculation of Nucleotide Sequence Statistics

First, we will check the composition of the Bradyrhizobium japonicum
fur gene, complete cds, using the NASTATS tool. Select the check box
in front of the sequence, select the NASTATS tool from the list under
the Nucleic Tools set, and click on the Run button. The output of the
NASTATS tool is shown in Fig. 15.21.
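
The kind of summary NASTATS reports can be approximated locally. The sketch below is an illustration and not the NASTATS implementation: it counts each base and derives the percentages and GC content. The function name `nucleotide_stats` and the 10-base toy sequence are assumptions.

```python
from collections import Counter


def nucleotide_stats(seq):
    """Return length, base counts, base percentages, and GC content."""
    # Remove whitespace and normalize case before counting.
    seq = "".join(seq.split()).upper()
    length = len(seq)
    if length == 0:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    base_counts = {base: counts.get(base, 0) for base in "ACGT"}
    percent = {base: 100.0 * n / length for base, n in base_counts.items()}
    return {
        "length": length,
        "counts": base_counts,
        "percent": percent,
        "gc_percent": percent["G"] + percent["C"],
    }


# Toy sequence (hypothetical): 3 A, 2 T, 3 G, 2 C.
stats = nucleotide_stats("ATGCGCGATA")
print(stats["length"], stats["counts"], stats["gc_percent"])
```

Ambiguity codes such as N are counted towards the length but not towards the four base totals, which is a design choice of this sketch.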



15.3.4.2. Align Two Sequences Using BL2SEQ Utility

In the present protocol, we will compare gene sequences from
two different organisms using the BLAST utility available on
WorkBench. Select the check box in front of the fur gene from
Bradyrhizobium japonicum. Select the BL2SEQ utility under the Nucleic Tools
set. Click on the Run button to submit the job. In the next screen,
Fig. 15.18 A sequence can be uploaded to the WorkBench by selecting the Add New Nucleotide Sequence tool. Note the Empty
tag, which shows that currently no sequence is associated with the WorkBench.



[Screenshot: the Add New Nucleotide Sequence form on Biology WorkBench (http://workbench.sdsc.edu), with Label and Sequence fields, a local file upload option, and a list of acceptable input formats (including EMBL, PIR CODATA, MSF, GCG, NBRF, and Phylip).]

Fig. 15.19 Snapshot of the Upload sequence utility on Biology WorkBench. Note the acceptable sequence input formats
available with the WorkBench.
[Screenshot: Biology WorkBench Nucleic Tools page listing the uploaded fur gene sequences, each preceded by a check box.]

Fig. 15.20 A total of ten nucleotide sequences of the fur gene from different organisms are uploaded in the current session. Note
the check box in front of each sequence and the various tools associated with the Nucleic Tools set.



select Azospirillum brasilense from the list of organisms (you can
also select more than one sequence for comparison). Enter an
Expectation value of 10 and click on the Submit button. The output
generated by the tool is shown in Fig. 15.22. A small local
alignment region between the sequences from the two organisms is shown
as a pairwise alignment. The position of the aligned fragment, the identity,
and the similarity score are also returned.
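
BL2SEQ finds its local alignment with the BLAST heuristic. To illustrate what a local alignment is, the sketch below implements the exact Smith-Waterman dynamic programming method (listed on the WorkBench as SSEARCH in Table 15.6), not the BLAST algorithm itself. The scoring values (match +2, mismatch -1, linear gap -2), the function name, and the two toy sequences are assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment of a and b.

    Returns (score, aligned_a, aligned_b, start_a, start_b), where the
    start positions are 1-based positions of the aligned fragment.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the scoring matrix; negative scores are reset to zero.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return (best, "".join(reversed(out_a)), "".join(reversed(out_b)),
            i + 1, j + 1)


# Toy sequences (hypothetical) sharing the fragment ACGTAC.
score, aln_a, aln_b, start_a, start_b = smith_waterman("TTTACGTACGGG",
                                                       "CCACGTACAA")
identity = sum(x == y for x, y in zip(aln_a, aln_b)) / len(aln_a)
print(score, aln_a, aln_b, start_a, start_b, identity)
```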



15.3.5. Nucleotide Sequence Similarity Searching Using BLASTN Tool on Biology WorkBench

Select the check box in front of the fur gene from Bradyrhizobium
japonicum. Select the BLASTN tool from the Nucleic Tools set and
click on the Run button. BLASTN compares a single DNA sequence
(query sequence) to all sequences in a database (library sequences)
to find those library sequences having the greatest similarity to the
query sequence. BLASTN uses a heuristic algorithm to allow
faster database searching. On the following screen, select the
databases to be searched (a maximum of 16 databases can be searched
in a single run). The search can be performed with both the forward and
reverse strands of the query sequence (Fig. 15.23). Select the
GenBank Bacterial Sequences [part 1] database from the list and click
on the Submit button. The output generated by the Biology
WorkBench is shown in Fig. 15.24. We can also fetch the details of the
Table 15.6
List of various utilities available with the Nucleic Tools set of Biology WorkBench

Select All Sequences
Deselect All Sequences
Ndjinn - Multiple Database Search
Retrieve BATCH Output
Add New Nucleic Sequence
Edit Nucleic Sequence(s)
Delete Nucleic Sequence(s)
Copy Nucleic Sequence(s)
View Nucleic Sequence(s)
Download Nucleic Sequence(s)
View Database Records of Imported Sequences
BL2SEQ - Compare nucleotides to each other with BLAST
BL2SEQX - Compare a nucleotide to protein sequences with BLAST
BLASTN - Compare a NS to a NS DB
BLASTX - Compare a PS-Derived-from-NS to a PS DB
TBLASTX - Compare a translated NS to a translated DB
FASTA - Nucleic Acid Sequence Comparisons (NS or DB)
FASTX - Compare Translated NS to PS DB
FASTY - Compare Translated NS to PS DB
SSEARCH - Smith-Waterman Local Alignment
CLUSTALW - Multiple Sequence Alignment
CLUSTALWPROF - Align Sequences to Existing Alignment (Profile)
ALIGN - Optimal Global Sequence Alignment
LALIGN - Calculate Optimal Local Sequence Alignments
LFASTA - Calculate Local Sequence Alignments (Heuristic)
PATTERNMATCHDB - Search for Regular Expressions (Patterns) in a nucleic sequence DB
PATTERNMATCH - Search for Regular Expressions (Patterns) in a nucleic sequence
TACG - Analyze a NS for Restriction Enzyme Sites
PRIMER3 - Design Primer Pairs and Probes
NASTATS - Nucleic Acid Statistics
BESTSCOR - Calculate the Best Self-Comparison Score
PFSCAN - Sequence Search Against a Set of Profiles (PROSITE)
PRIMERCHECK - Calculates melting point, length, %GC for a primer sequence
PRIMERTM - Designs end primers based on a minimum Tm
SIXFRAME - Generate and Import 6 Frame Translations on a NS
REVCOMP - Generate Reverse Complement of NS
RANDSEQ - Randomize a Sequence
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Fig. 15.21 Result screen of NASTATS Tool for Bradyrhizobium japonicum fur gene. Note the nucleotide percentage and 
GC content of the gene. 



database sequence by selecting the check box in front of each 1-line
description followed by the Show Records button. The output
is arranged in decreasing order of bit score, which is defined as:

$$S' = \frac{\lambda S - \ln K}{\ln 2}$$

where $S'$ is the bit score, $S$ is the raw score, and $\lambda$ and $K$ are the Karlin-Altschul parameters.
Fig. 15.22 BL2SEQ output for Bradyrhizobium japonicum and Azospirillum brasilense fur gene.

Fig. 15.23 BLASTN input screen on WorkBench. The Expectation value, 1-line descriptions, and number of alignments returned by
the server can be adjusted.

Fig. 15.24 BLASTN output screen on WorkBench. (a) Total number of database sequences matched with the query
sequence. (b) 1-line description of results. Database name, accession number, title of the sequence, bit score, and E-value
are shown. (c) The pairwise alignment of the query sequence with a database sequence.

Fig. 15.25 Snapshot of the screen to adjust various parameters in the CLUSTALW - Multiple Sequence Alignment tool.



The Expect value (E-value) estimates the statistical significance of a match:
it is the number of matches with a given score that are
expected purely by chance in a search of a database of this size.
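
The bit score and the Expect value are linked by simple arithmetic. The sketch below assumes the standard Karlin-Altschul relations: the bit-score definition given in the text, and E = m × n × 2^(-S'), where m and n are the query and database lengths. The lambda and K values used in the example are illustrative placeholders, not the parameters the server actually used, and the function names are hypothetical.

```python
import math


def bit_score(raw_score, lam, K):
    """Convert a raw alignment score to a bit score."""
    return (lam * raw_score - math.log(K)) / math.log(2)


def expect_value(bits, query_length, database_length):
    """Number of chance matches expected with at least this bit score."""
    return query_length * database_length * 2.0 ** (-bits)


# Illustrative (assumed) parameters and sizes.
lam, K = 1.28, 0.46
raw = 100
bits = bit_score(raw, lam, K)
e_value = expect_value(bits, query_length=832, database_length=1_000_000)
print(round(bits, 1), e_value)
```

Because the bit score is normalized by lambda and K, E-values computed from it can be compared across searches that used different scoring systems.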



15.3.6. Multiple Sequence Alignment Using CLUSTALW Utility on Biology WorkBench

The Clustal programs are widely used for carrying out automatic
multiple alignments of nucleotide or amino acid sequences. The
most familiar version is ClustalW [3]. Select the checkboxes in front
of all ten fur genes from various microorganisms uploaded on
the Biology WorkBench server in the earlier protocols. Select the
CLUSTALW - Multiple Sequence Alignment option from the
Nucleic Tools set. Click on the Run button. On the next screen,
various pairwise and multiple alignment parameters can be
adjusted. Select Aligned from the Output order dropdown list and
Rooted and Unrooted Trees from the Guide tree display dropdown
list. From the Multiple Alignment Parameters section, select the
ClustalW (1.6) option from the Weight Matrix dropdown list
(Fig. 15.25). Click on the Submit button to run the CLUSTALW
tool. The output is shown in Fig. 15.26a-c. Other multiple
alignment tools available in the public domain are listed in Table 15.7,
and phylogenetic analysis tools are listed in Table 15.8. The
alignment file generated by CLUSTALW can be imported
into the Biology WorkBench by selecting the Import Alignment(s)
button. The ClustalW-aligned file can be further processed by
various tools in the Alignment Tools set.
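
ClustalW derives its guide tree from pairwise distances between sequences. As a rough illustration of that idea (not ClustalW's actual distance calculation or tree-building code), the sketch below computes the proportion of differing sites, often called the p-distance, for every pair of sequences in an existing alignment. The three aligned toy sequences and the function names are hypothetical.

```python
def p_distance(x, y):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(x) != len(y):
        raise ValueError("aligned sequences must have equal length")
    compared = differences = 0
    for cx, cy in zip(x, y):
        if cx == "-" or cy == "-":
            continue
        compared += 1
        if cx != cy:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return differences / compared


def distance_matrix(alignment):
    """Map every ordered pair of sequence names to its p-distance."""
    names = list(alignment)
    return {(p, q): p_distance(alignment[p], alignment[q])
            for p in names for q in names}


# Toy alignment (hypothetical), gaps written as '-'.
alignment = {
    "seqA": "ACGT-ACGT",
    "seqB": "ACGTTACGA",
    "seqC": "ACCT-ACGA",
}
matrix = distance_matrix(alignment)
for pair, d in sorted(matrix.items()):
    print(pair, round(d, 3))
```

A tree-building method such as neighbor joining would take a matrix like this as its input; that step is outside the scope of the sketch.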
[Screenshot, three panels: (a) CLUSTALW multiple alignment of the fur gene sequences; (b) Clustal dendrogram shown as an unrooted tree (generated by Phylip's Drawtree); (c) rooted tree (generated by Phylip's Drawgram) with the leaves Acinetobacter baumannii, Salmonella enteritidis, Vibrio anguillarum, Vibrio vulnificus, Acidithiobacillus ferrooxidans, Neisseria meningitidis, Pseudomonas aeruginosa, Bradyrhizobium japonicum, Azospirillum brasilense, and Bordetella pertussis.]

Fig. 15.26 (a) The multiple alignment screen for ten fur genes from different organisms. The aligned regions are
highlighted in blue color. (b) Unrooted tree generated by Phylip's Drawtree program between multiple aligned sequences.
(c) Rooted tree generated by Phylip's Drawgram program. Note the similar organisms based on the distances of fur genes.

Table 15.7
List of frequently used multiple sequence alignment tools

Name | Description | Web address
MAFFT | Recent developments in the MAFFT multiple sequence alignment program | http://align.bmr.kyushu-u.ac.jp/mafft/software/
Clustal-W | ClustalW2 is a general purpose global multiple sequence alignment program for DNA or proteins | http://www.ebi.ac.uk/clustalw
PCMA | Fast and accurate multiple sequence alignment based on profile consistency | ftp://iole.swmed.edu/pub/PCMA/
ProAlign | A hidden Markov model for progressive multiple alignment | http://ueg.ulb.ac.be/ProAlign/
MAVID | Constrained ancestral alignment of multiple sequences | http://baboon.math.berkeley.edu/mavid
ABA | A novel method for multiple alignment of sequences with repeated and shuffled elements | http://nbcr.sdsc.edu/euler
POA | Combining partial order alignment and progressive sequence alignment increases alignment speed and scalability to very large alignment problems | http://www.bioinformatics.ucla.edu/poa/
DIALIGN | DIALIGN-T: an improved algorithm for segment-based multiple sequence alignments | http://dialign.gobics.de/chaos-dialign-submission
NRAlign | Improving accuracy of multiple sequence alignment algorithms based on alignment of neighboring residues | http://faculty.cs.tamu.edu/shsze/nralign



238 



Smita et al. 



Table 15.8
List of important phylogenetic analysis tools

Name | Description | Web address
Phylip | A free package of programs for inferring phylogenies. It is distributed as source code, documentation files, and a number of different types of executables | http://evolution.genetics.washington.edu/phylip.html
Phylogeny.fr | A simple-to-use Web service dedicated to reconstructing and analyzing phylogenetic relationships between molecular sequences. Phylogeny (PhyML, MrBayes, TNT, BioNJ), tree viewer (Drawgram, Drawtree, ATV) | http://www.phylogeny.fr/version2_cgi/index.cgi
PHYML | A simple, fast, and accurate algorithm to estimate maximum likelihood phylogenies from DNA and protein sequences | http://atgc.lirmm.fr/phyml/
ProtTest | Estimates the empirical model of amino acid substitution that fits the data best among 64 candidate models | http://darwin.uvigo.es/
POWER | Carries out phylogenetic analysis repeatedly with most programs of the PHYLIP package | http://power.nhri.org.tw/power/home.htm
CVTree | Constructs whole-genome-based phylogenetic trees without sequence alignment by using a Composition Vector (CV) approach | http://tlife.fudan.edu.cn/cvtree/



Fig. 15.27 Some of the Primer3 input parameters for designing optimal primers for the selected sequence on Biology
WorkBench.
[Screenshot: Primer3 output (Optimal Primer Pair/Probe, using 1-based sequence positions) for the 832-base fur gene sequence. The table reports start, length, Tm, GC %, any, 3', and sequence for a left primer (start 90, length 20, Tm 60.09, GC 50.00 %) and a right primer (start 209, length 20, GC 45.00 %, sequence CAGAAGGGCGTTCAATTGTT), followed by the sequence size (832), included region size (832), and product size (120), with the primer positions marked on the template sequence.]

Fig. 15.28 Output screen of Primer3 software on Biology WorkBench.



15.3.7. Designing of Primer Sets to Amplify Desired Region of a Sequence

A primer is a short strand of nucleic acid that serves as a starting point for DNA
synthesis. Primers are essential for DNA replication and amplification
because DNA polymerase can only add new
nucleotides to the 3'-end of an existing strand.
The polymerase starts synthesis at the 3'-end of the primer and
copies the opposite strand. A primer is a short string of nucleotides
(usually 15-30) that is complementary to the first part of the
segment of DNA that is being copied. The two strands of a DNA molecule
are antiparallel, that is, they run in opposite directions. Since DNA
polymerase synthesizes DNA in the 5' (phosphate) to 3' (hydroxyl) direction,
two different DNA primers are needed, one for
each strand of the molecule. Pairs of primers should have
Table 15.9
List of commonly used primer/probe designing tools

Name | Description | Web address
AutoPrime | Primer design for real-time PCR measurement of eukaryotic gene expression | http://www.autoprime.de/AutoPrimeWeb
CODEHOP | COnsensus-DEgenerate Hybrid Oligonucleotide Primers designed from protein multiple sequence alignments | http://bioinformatics.weizmann.ac.il/blocks/codehop.html
ExonPrimer | Design intronic primers for PCR amplification of exons. Input needed: a cDNA and the corresponding genomic sequence | http://ihg.gsf.de/ihg/ExonPrimer.htm
NetPrimer | Java applet for primer design | http://www.premierbiosoft.com/netprimer/
Primer3 | Utility for locating oligonucleotide primers for PCR amplification of DNA sequences | http://frodo.wi.mit.edu/primer3/
PrimerX | Automated design of mutagenic primers for site-directed mutagenesis | http://www.bioinformatics.org/primerx/
Primo Pro | PCR Primer Design | http://www.changbioscience.com/primo
Web Primer | Primer design and sets for amplifying yeast ORFs | http://www.yeastgenome.org/cgi-bin/web-primer



similar melting temperatures, since annealing in a PCR occurs for
both simultaneously [4]. Primers should not easily anneal with
other primers in the mixture; this phenomenon can lead to the
formation of “primer dimer” products that contaminate the mixture.
Primers should also not anneal strongly to themselves, as
internal hairpins and loops could hinder annealing with the
template DNA. Desired characteristics of an automated DNA
sequencing primer design are:

• Based on accurate sequence 

• Melting temperature 52-65 °C 

• Absence of self-hybridization 

• Absence of significant hairpin formation (>3 bp) 

• Lack of secondary priming sites 

• Low specific binding at the 3' end (i.e., lower GC content to 
avoid mispriming) 
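
Some of the characteristics listed above can be screened with simple arithmetic. The sketch below is a rough illustration, not the Primer3 or PRIMERCHECK implementation: it estimates the melting temperature with the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a crude approximation for short oligonucleotides, checks it against the 52-65 °C window given above, and counts G/C bases at the 3' end. The threshold of at most three G/C bases in the last five positions and all function names are assumptions; hairpin and secondary priming site checks are not covered. The example uses the right primer sequence legible in Fig. 15.28.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_percent(seq):
    """Percentage of G and C bases."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature (deg C) by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def check_primer(seq, tm_range=(52, 65), max_3prime_gc=3):
    """Screen one primer against a few of the listed characteristics."""
    seq = seq.upper()
    tm = wallace_tm(seq)
    last5 = seq[-5:]
    gc_at_3prime = last5.count("G") + last5.count("C")
    return {
        "length": len(seq),
        "gc_percent": gc_percent(seq),
        "tm_wallace": tm,
        "tm_ok": tm_range[0] <= tm <= tm_range[1],
        # Assumed threshold: few G/C bases in the last five positions.
        "three_prime_ok": gc_at_3prime <= max_3prime_gc,
        # A primer equal to its own reverse complement can pair with itself.
        "is_self_complementary": seq == reverse_complement(seq),
    }


report = check_primer("CAGAAGGGCGTTCAATTGTT")
print(report)
```

The GC content computed for this primer (45 %) agrees with the value shown in Fig. 15.28; the Wallace estimate differs slightly from the melting temperatures in the figure because Primer3 uses a more detailed thermodynamic model.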

Biology WorkBench provides an interface to the Primer3 software,
which is a widely used program for designing PCR primers.
Primer3 predicts primers based on the following criteria:

(a) Oligonucleotide melting temperature, size, GC content, and
primer-dimer possibilities

(b) PCR product size

(c) Positional constraints within the source sequence

(d) Miscellaneous other constraints

All of these criteria are user specifiable as constraints, and
some are specifiable as terms in an objective function that
characterizes an optimal primer pair. Select the fur gene from
Bradyrhizobium japonicum. Choose the Primer3 tool from the Nucleic
Tools set. Click on the Run button. The next screen (Fig. 15.27)
provides the Sequence Region Selection and Primer Selection
Criteria options used to design the best primers for the selected sequence. The
Primer3 output on Biology WorkBench is shown in Fig. 15.28.
Both the left and right primers are shown in the figure. The output
includes the start position of each primer, its length, melting
temperature, GC %, any (self-complementarity score of the primer, taken as a
measure of its tendency to anneal to itself or form secondary
structure), 3' (3' self-complementarity, taken as a measure of its tendency
to form a primer-dimer with itself), and the primer sequence.
The left and right primer sequences can also
be uploaded to the Biology WorkBench session simply by pressing
the Import Sequence(s) button. Other commonly used primer
design tools are listed in Table 15.9.

15.3.8. Bioinformatics Tools for Metagenomics

Metagenomics is an emerging field in which the power of genomic
analysis (the analysis of the entire DNA in an organism) is applied to
whole communities of microbes, bypassing the need to
isolate and culture individual microbial species. By permitting the
direct investigation of bacteria, viruses, and fungi irrespective of
their culturability and taxonomic identities, metagenomics has
changed microbiological theory and methods and has challenged
the classical concept of species. This new field of biology has
proven to be rich and comprehensive and is making significant
contributions in numerous areas, including ecology, biodiversity,
bioremediation, bioprospecting of natural products, and medicine.
Various bioinformatics tools have been developed for
metagenomics analysis. Some of them are listed in Table 15.10.

15.3.9. Bioinformatics Tools for Metabolomics

Metabolomics is the “systematic study of the unique chemical
fingerprints that specific cellular processes leave behind,” that is, the study
of their small-molecule metabolite profiles. The metabolome
represents the collection of all metabolites in a biological cell, tissue,
organ, or organism, which are the end products of cellular
processes. One of the challenges of systems biology and functional
genomics is to integrate proteomic, transcriptomic, and
metabolomic information to give a more complete picture of living
organisms. Some of the tools are listed in Table 15.11.
Table 15.10

List of important metagenomics analysis tools

Name | Description | Web address
MEtaGenome Analyzer | Taxonomic analysis | http://ab.inf.uni-tuebingen.de/software/megan/
Integrated Microbial Genomes | Microbial Genome Data Management and Analysis Systems | http://img.jgi.doe.gov/
PyroTagger | Classify multiplexed amplicon pyrosequence data from any region of the SSU rRNA gene when provided with barcode and primer sequences | http://pyrotagger.jgi-psf.org/cgi-bin/index.pl
CLaMS | Sequence composition-based classifier for metagenomic sequences | http://clams.jgi-psf.org/
FAMeS | Simulated data sets to evaluate the fidelity of metagenomic processing methods | http://fames.jgi-psf.org/
GOLD | Comprehensive access to information regarding complete and ongoing genome projects, as well as metagenomes and metadata | http://www.genomesonline.org/

Table 15.11

List of important metabolomics analysis tools

Name | Description | Web address
The Human Metabolome Database | Information about small molecule metabolites found in the human body | http://www.hmdb.ca/
Fiehn Laboratory | Identification and quantification of all metabolites in a given biological situation | http://fiehnlab.ucdavis.edu/
MMCD | High-throughput NMR and MS approaches to the identification and quantification of metabolites present in biological samples | http://mmcd.nmrfam.wisc.edu/
MeT-RO | Establish a critical mass of resources that can be applied to plant and microbial metabolomics | http://www.metabolomics.bbsrc.ac.uk/MeT-RO.htm
PRiMe | Metabolomics and Transcriptomics as systems for understanding life | http://prime.psc.riken.jp/
METLIN | A repository of metabolite information as well as tandem mass spectrometry data | http://metlin.scripps.edu/
MZmine | Mass spectrometry data processing, with the main focus on LC-MS data | http://mzmine.sourceforge.net/index.shtml
MathDAMP | The visualization of differences between metabolite profiles acquired by hyphenated mass spectrometry techniques | http://mathdamp.iab.keio.ac.jp/
COMSPARI | Facilitate the analysis of “paired” samples | http://www.biomechanic.org/comspari/
Molecular Phylogenetics of Microbes

Surajit Das and Hirak Ranjan Dash 



Abstract 

Molecular phylogenetics, or molecular systematics, is the use of molecular structure to gain information on
an organism’s evolutionary relationships, which are expressed as a phylogenetic tree. The impact of
molecular systematics on bacterial classification has been profound. In 1977, the Woesian revolution
occurred when Carl Woese, a microbiologist working in relative isolation, compared 16S rRNA sequences to
study the classification of microorganisms. This molecular approach revealed three (Archaea, Bacteria,
Eukarya), rather than five (Animalia, Plantae, Fungi, Monera, Protista), primary divisions of life and
uncovered extraordinary levels of microbial diversity. Although different cellular components provide a
great deal of information about an organism, SSU rDNAs (genes coding for small subunit ribosomal RNA)
offer a quality and quantity of information that make them one of the most useful macromolecular
descriptors of microorganisms. However, 16S rRNA sequence analysis has been criticized because some cases
of lateral gene transfer have been reported in these genes. To avoid this ambiguity, analysis of certain
housekeeping genes such as gapA, groEL, gyrA, ompA, and pgi is recommended alongside analysis of the 16S
rRNA gene. The targeted genes can be analyzed either by documenting the gel on which the amplified
products are run or by analyzing the sequences of the genes of interest. This chapter focuses on various
molecular techniques, including gel-based techniques, sequence-based techniques, and the analysis of
dendrograms and cladograms.



16.1 Introduction 



Molecular phylogenetics, or molecular systematics, is the use of
molecular tools to gain information on an organism’s evolutionary
relationships, which are expressed as a phylogenetic tree. The impact
of molecular systematics on bacterial classification has been
profound. In 1977, the Woesian revolution occurred when Carl Woese, a
microbiologist working in relative isolation, compared 16S rRNA
sequences to study the classification of microorganisms. This
molecular approach revealed three (Archaea, Bacteria, Eukarya), rather
than five (Animalia, Plantae, Fungi, Monera, Protista), primary
divisions of life and uncovered extraordinary levels of microbial
diversity. Although different cellular components provide a great deal
of information about an organism, SSU rDNAs (genes coding for small
subunit ribosomal RNA) offer a quality and quantity of information
that make them one of the most useful macromolecular descriptors of
microorganisms. SSU rDNAs are widely used as informative biomarkers
because:

• They are essential components of protein synthesis machinery 
and therefore are ubiquitously distributed and functionally 
conserved in all organisms. 

• They lack the interspecies horizontal gene transfer found with 
many prokaryotic genes. 

• They are readily isolated and identified. 

• They contain diagnostic variable regions interspersed among 
highly conserved regions of primary and secondary structures, 
permitting phylogenetic comparisons to be inferred over a 
broad range of evolutionary distance. 

However, 16S rRNA sequence analysis has been criticized because some
cases of lateral gene transfer have been reported in these genes [1].
To avoid this ambiguity, analysis of certain housekeeping genes such
as gapA, groEL, gyrA, ompA, and pgi is recommended alongside analysis
of the 16S rRNA gene. The targeted genes can be analyzed either by
documenting the gel on which the amplified products are run or by
analyzing the sequences of the genes of interest.

Phylogenetic analysis is also known as molecular taxonomy. It
represents evolutionary information in the form of phylogenetic trees.
There are several methods for constructing a phylogenetic tree. One of
the most popular tools is PHYLIP (PHYLogeny Inference Package;
http://evolution.genetics.washington.edu/phylip.html). The Tree of
Life is a collaborative Internet project containing information about
phylogeny and biodiversity.

Software is then used to draw dendrograms that show the evolutionary
history of the organism based on the sequence variation of the gene of
interest. Thus, to study the molecular phylogeny of an organism,
gel-based and sequence-based techniques are used, followed by the
drawing of a dendrogram or cladogram to determine the phylogeny and
relatedness among organisms.

The genetic fingerprinting technique called denaturing gradient gel
electrophoresis (DGGE) of PCR-amplified ribosomal DNA was first
introduced for microbial typing in 1993 [2]. In DGGE, DNA fragments of
the same length but with different sequences can be separated. PCR of
environmental DNA can generate templates of differing DNA sequence
that represent many of the dominant microbial organisms. However,
since the PCR products from a given reaction are of similar size (bp),
conventional separation by agarose gel electrophoresis results only in
a single DNA band that is largely non-descriptive. DGGE overcomes this
limitation by separating PCR products based on sequence differences
that result in different denaturing characteristics of the DNA. Once
generated, fingerprints can be uploaded into databases in which
fingerprint similarity can be assessed to determine differences in
microbial community structure between environments or among
treatments. Furthermore, with the breadth of PCR primers available,
DGGE can also be used to investigate broad phylogenies or specific
target organisms such as pathogens or xenobiotic degraders. In DGGE,
separation is based on the decreased electrophoretic mobility of a
partially melted double-stranded DNA molecule in polyacrylamide gels
containing a linear gradient of DNA denaturants (a mixture of urea and
formamide). The GC content of the sequence plays a role in the
separation. Because a GC base pair is stronger than an AT base pair, a
sequence with high GC content resists denaturation and travels
farther, whereas a sequence with lower GC content denatures sooner and
stops migrating earlier. As a result, amplified products of the same
size but different sequence give a specific banding pattern. The
technique is highly useful for estimating the microbial diversity of
an environment by a metagenomic approach.
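
The role of GC content described above can be sketched in a few lines. The example below ranks equal-length fragments by overall GC fraction as a crude proxy for how far each would be expected to travel before melting. This is a simplifying assumption: real DGGE mobility is governed by the melting behavior of individual domains rather than by overall GC content alone, and the fragment names and sequences are invented for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rank_by_gc(fragments: dict) -> list:
    """Return fragment names ordered from highest to lowest GC fraction.

    Under the simplification used here, names earlier in the list are
    expected to resist denaturation longer and migrate farther.
    """
    return sorted(fragments, key=lambda name: gc_fraction(fragments[name]),
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical same-length amplicons from three community members
    amplicons = {"a": "ATATATGC", "b": "GCGCGCAT", "c": "ATGCATGC"}
    for name in rank_by_gc(amplicons):
        print(name, gc_fraction(amplicons[name]))
```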

Amplified rDNA restriction analysis (ARDRA) is the extension of the
RFLP technique to the gene encoding the small subunit ribosomal RNA of
bacteria. The technique was originally developed by [3] to
characterize Mycobacterium species, and it has since been used by many
others to characterize other bacterial species. The patterns are
analyzed with the methods used for RAPD patterns. Clusters of related
bacteria can be represented in the form of a dendrogram or phenogram.
The data can subsequently be used to generate a phylogram or
cladogram, from which a phylogenetic tree can be plotted that
indicates the relationships of the organisms based on the restriction
patterns obtained from their respective 16S genes. It is assumed that
related organisms give the same restriction pattern on digestion with
a particular restriction enzyme. A 4 bp recognition sequence is
expected to occur at random about once every 256 bp (4^4 = 256). When
a tetra-cutter restriction enzyme is applied to the amplified 16S
sequence, it digests the sequence at particular sites to give a
particular restriction pattern, which is regarded as the signature of
the organism. The restriction pattern of one organism can be compared
with that of another to determine the phylogenetic relationship
between them. However, for the sake of statistical significance, at
least three restriction enzymes should be used, to reduce the
probability that a given restriction enzyme yields similar patterns
for unrelated organisms.
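
The expected site frequency and the idea of a restriction pattern as a signature can be illustrated with a small in silico digest. The sketch below assumes a linear amplicon and the AluI recognition site AGCT, cut between G and C to leave blunt ends; the function names and the short test sequence are hypothetical and serve only to show the bookkeeping.

```python
def expected_spacing(site_length: int) -> int:
    """Average distance in bp between random occurrences of a site."""
    return 4 ** site_length


def digest(seq: str, site: str = "AGCT", cut_offset: int = 2) -> list:
    """Return fragment lengths after cutting a linear sequence.

    cut_offset is the number of bases of the site left on the upstream
    fragment (2 for a blunt cut in the middle of a 4 bp site).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def pattern(seq: str, site: str = "AGCT") -> list:
    """Fragment sizes sorted largest first, as bands would appear on a gel."""
    return sorted(digest(seq, site), reverse=True)


if __name__ == "__main__":
    amplicon = "AAAGCTTTTTAGCTCC"  # invented sequence with two AluI sites
    print(expected_spacing(4), digest(amplicon), pattern(amplicon))
```

Comparing the sorted patterns obtained with several enzymes, rather than one, mirrors the recommendation above to use at least three restriction enzymes.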
Restriction fragment length polymorphism (RFLP) is a general technique
applied for phylogeny and fingerprinting analysis of bacteria based on
the use of restriction enzymes, which cut DNA at specific sequences.
It is assumed that genetically related strains have a similar
distribution of restriction sites in their genomes. However, RFLP has
certain disadvantages, such as pattern complexity: too many fragments
are generated when any common 4-6 cutter restriction enzyme cuts
genomic DNA. To avoid this problem, PCR-RFLP should be performed for
bacterial strains with particular characteristics such as heavy metal
resistance or antibiotic resistance.

Polymerase chain reaction-based analysis of 16S rRNA genes is a
powerful and essential tool for studies of bacterial diversity,
community structure, evolution, and taxonomy. It enables us to detect
and identify both cultivable and as-yet uncultivable bacteria, and in
recent years it has led to an enormous increase in our knowledge of
bacterial taxonomy. The rRNA genes are among the most conserved genes
in all cells. Portions of the rDNA sequences from distantly related
organisms are remarkably similar. Hence, sequences from distantly
related organisms can be precisely aligned and the true differences
measured. For this reason, genes that encode rRNA are used extensively
to determine taxonomy and phylogeny and to estimate rates of species
divergence among bacteria. Thus, the comparison of 16S rDNA sequences
can show evolutionary relatedness among microorganisms. This work was
pioneered by Carl Woese.

rpoB is a bacterial housekeeping gene that codes for part of the
enzyme that synthesizes RNA, i.e., the β subunit of bacterial RNA
polymerase. RpoB is a highly conserved protein found in many bacteria.
The complete rpoB gene sequence varies in length from 3,452 to
3,845 bp and exhibits interspecies homology and intraspecies
divergence [4]. Thus, it can be used as an alternative tool for
identification and subsequently for the analysis of phylogeny.
Although published work mainly illustrates the usefulness of these
sequences for the family Enterobacteriaceae, the approach can be
applied to almost all types of organisms, including bacteria and
archaea [5]. Although it is a housekeeping gene, rpoB is also prone to
mutations in bacteria: distinct nucleotide substitutions in the
sequence can confer resistance to rifampin [6]. rpoB, the gene
encoding the highly conserved β subunit of the bacterial RNA
polymerase, has been demonstrated to be a suitable target for the
identification of enteric bacteria, spirochetes, bartonellas, and
rickettsias [7]. The gene has been shown to be more discriminative
than the 16S ribosomal DNA (rDNA).

gyrA is another bacterial housekeeping gene; it codes for the A
subunit of DNA gyrase, a type II topoisomerase. The gene lies in a
5-9 kb region that includes part of an upstream recF gene, the whole
of gyrB and gyrA, and about 1 kb of unknown downstream sequence.
Transcription of the gyrA gene increases in response to DNA
relaxation. The most important point in relation to gyrA is that, in
most microorganisms, resistance to quinolones is related to the
acquisition of point mutations in the quinolone resistance-determining
region (QRDR) of the gene.

In these microorganisms, mutations in the gene encoding subunit A of
DNA gyrase can confer quinolone resistance, and overexpression of
efflux pumps may play a complementary role in the acquisition of
quinolone resistance [8].

The differences in the QRDR sequences of the gyrA gene allow
differentiation of species by PCR-restriction fragment length
polymorphism analysis using the NcoI restriction enzyme [9]. The
positions of the restriction sites for this enzyme in the gyrA
sequence depend on the organism to which the gene belongs. However,
the level of quinolone resistance seems to depend on the type of amino
acid substitution and on the amino acid substituted. Given these
selective variations, gyrA sequences can be compared with each other
to draw a phylogenetic tree. However, this sequence-based use of the
housekeeping gene has mostly been restricted to pathogenic strains
with characteristic quinolone resistance and is mostly applicable to
Staphylococcus, Streptococcus, Helicobacter, and Corynebacterium [10].

The sequencing of DNA and proteins has become easy and fast with the
use of automated tools. Given a set of sequences, the evolutionary
relationships among genes and among organisms can be reconstructed. A
phylogeny illustrates the relationships between the sequences. The
phylogeny of a family of related nucleic acid or protein sequences is
analyzed by determining how the family might have been derived during
evolution. Molecular evolution is driven by mutations that change the
DNA. Mutations can occur when there are errors in DNA replication or
repair. There are various models of DNA change, and the selection of a
model is one of the fundamental decisions in conducting a phylogenetic
analysis.

The number or types of changes in the residues of a multiple sequence
alignment (MSA) can be used to start a phylogenetic analysis. Each
column in the MSA records the mutations that occurred at one site
during the evolution of the sequence family. This information can be
used to evaluate which positions in the sequences are conserved and
which have diverged from a common ancestral sequence.
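
A minimal sketch of this column-by-column reading of an alignment is shown below. It assumes the sequences are already aligned to equal length and simply labels each column as conserved (one residue in all sequences) or variable; the function name and the toy alignment are invented for illustration, and gaps are treated like any other character.

```python
def column_status(alignment: list) -> list:
    """Label every alignment column as 'conserved' or 'variable'."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) walks the alignment one column at a time
    return ["conserved" if len(set(column)) == 1 else "variable"
            for column in zip(*alignment)]


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACTT"]
    for position, status in enumerate(column_status(toy_alignment), start=1):
        print(position, status)
```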

Evolutionary relationships can be represented using phylogenetic
trees. A tree is a 2D graph showing evolutionary relationships among
organisms. The tree is composed of nodes (points where branches
bifurcate) representing the taxa and branches representing the
relationships among the taxa. The lengths of the branches are often
drawn proportional to the number of sequence changes in the branch and
hence can represent the divergence. Two sequences that are very much
alike will be located as neighboring outside branches and will be
joined to a common branch beneath them. Phylogenetic analysis methods
usually assume that each position in the protein or nucleic acid
sequence changes independently of the others. A clade is a group of
organisms whose members share homologous features derived from a
common ancestor. Various programs are available for performing
phylogenetic operations, and the programs and program options differ
for DNA and protein sequences. Some of the most popular are
PhyloBLAST, PHYLIP, PAUP, PAML, ClustalW, and ClustalX.

16.2 Materials

16.2.1. DGGE

In a dendrogram, trees are constructed from similarities between
sequences, which do not necessarily reflect evolutionary
relationships. Distance methods, also called phenetic methods,
compress all of the individual differences between a pair of sequences
into a single number. Character-based methods are also called
cladistic methods. Their trees are calculated by considering the
various possible pathways of evolution and are based on parsimony or
likelihood methods. The resulting tree is called a cladogram.
Cladistic methods use each alignment position as evolutionary
information to build a tree.
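
The way a distance method compresses the differences between two sequences into a single number can be sketched as follows. The example assumes aligned sequences and uses the uncorrected p-distance (the proportion of compared sites that differ), which is only one of several possible distance measures; the function names and the three toy sequences are invented for illustration.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, ignoring columns with a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no gap-free columns to compare")
    return sum(x != y for x, y in compared) / len(compared)


def distance_matrix(sequences: dict) -> dict:
    """Pairwise p-distances keyed by (name1, name2)."""
    return {(n1, n2): p_distance(sequences[n1], sequences[n2])
            for n1, n2 in combinations(sequences, 2)}


if __name__ == "__main__":
    toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TCGTTCGA"}
    for pair, distance in distance_matrix(toy).items():
        print(pair, distance)
```

A matrix of this kind is the input that clustering procedures such as UPGMA or neighbor joining then turn into a tree.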

There are two major types of cladistic methods: those based on
parsimony and those based on maximum likelihood. In the parsimony
methods, for each position in the alignment, all possible trees are
evaluated and given a score based on the number of evolutionary
changes needed to produce the observed sequence changes. The most
parsimonious tree is the one that requires the fewest evolutionary
changes for all sequences to derive from a common ancestor. This is
more time-consuming than the distance methods. The maximum likelihood
method also uses each position in an alignment. It evaluates all
possible trees and calculates the likelihood of each tree using an
explicit model of evolution. The likelihoods for each aligned position
are then multiplied to give the composite likelihood of each tree. The
tree with the maximum likelihood is the most probable tree. This is
the slowest method of all but may give the best result and the most
information about the tree. Programs for such analyses include PHYLIP
DNAPARS and fastDNAml.
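
To make the parsimony score concrete, the sketch below counts the minimum number of changes each candidate tree needs, site by site, using Fitch's algorithm. This is one standard way to score a fixed tree and is offered as an assumption for illustration; it does not search over all trees and is not the implementation used in PHYLIP DNAPARS. Trees are written as nested pairs of taxon names, and the alignment is a toy example.

```python
def fitch(tree, states: dict):
    """Return (possible ancestral states, changes) for one alignment site."""
    if isinstance(tree, str):          # a leaf: its observed state, no change
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:                         # children can agree: no extra change
        return shared, left_changes + right_changes
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, alignment: dict) -> int:
    """Total number of changes a tree requires over all aligned sites."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


if __name__ == "__main__":
    toy = {"A": "AAG", "B": "AAG", "C": "ATC", "D": "ATC"}
    tree_1 = (("A", "B"), ("C", "D"))
    tree_2 = (("A", "C"), ("B", "D"))
    # The tree with the lower score is the more parsimonious of the two
    print(parsimony_score(tree_1, toy), parsimony_score(tree_2, toy))
```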



• Bacterial genomic DNA

• 16S rDNA forward primer (GC331F) [CGC CCG CGC GCG GCG GGC GGG GCG
GGG GCG CGG GGG GTC CTA CGG GAG GCA GCA GT]

• 16S rDNA reverse primer (797R) [GGA CTA CCA GGG TAT CTA ATC CTG TT]

• dNTPs

• Taq DNA polymerase

• PCR reaction buffer

• PCR tubes and tips

• Materials required for gradient polyacrylamide gel electrophoresis

16.2.2. ARDRA

• Genomic DNA

• 16S forward [AGA GTT TGA TCC TGG CTC AG] primer

• 16S reverse [ACG GCT ACC TTG TTA CCA CTT] primer

• dNTPs

• Taq DNA polymerase

• PCR reaction buffer

• PCR tubes and tips

• Restriction enzymes (tetra cutters)

• Gel apparatus and reagents

16.2.3. 16S rRNA Gene Sequencing

• Genomic DNA

• PCR buffer

• Magnesium chloride

• dNTPs

• Universal forward and reverse primers

• Taq DNA polymerase

• Automatic DNA sequencer

• NCBI database

16.2.4. rpoB Gene Sequencing

• Genomic DNA

• PCR buffer

• Magnesium chloride

• dNTPs

• rpoB forward [2643F: 5' CAA TTC ATC GAC CAA GC 3'] primer

• rpoB reverse [3241R: 5' GCI ACI TGI TCC ATA GCT GT 3'] primer

• Taq DNA polymerase

• DNA sequencer

• rpoB database
16.2.5. gyrA Gene Sequencing

• Genomic DNA

• PCR buffer

• Magnesium chloride

• dNTPs

• gyrA forward [5' GCG GCT ACG TAA AGT CC 3'] primer

• gyrA reverse [5' GCG CCG GAG CCG TTC AT 3'] primer

• Taq DNA polymerase

• DNA sequencer

• gyrA database

16.3 Methods

16.3.1. DGGE

16.3.1.1. PCR

Reaction mixture | Cycling conditions
10x buffer: 10 µl | Initial denaturation: 94 °C for 5 min
10 mM dNTP mixture: 2.5 µl | Denaturation: 94 °C for 30 s (30 cycles)
5 µM forward primer: 10 µl | Annealing: 55 °C for 30 s (30 cycles)
5 µM reverse primer: 10 µl | Extension: 72 °C for 2 min (30 cycles)
Taq polymerase (2.5 U/µl): 1 µl | Final extension: 72 °C for 5 min
Water: 39 µl | Hold: 4 °C indefinitely
Template DNA: 2 µl |

16.3.1.2. Sample Preparation

• Load 50 µl of the amplified DNA per well.

• Prior to loading, add an equal volume of 2x gel loading dye to the
sample.

16.3.1.3. Preheating of the Loading Buffer

• Fill the electrophoresis tank with 7 L of 1x TAE running buffer.

• Set the temperature to 60 °C and the temperature ramp rate to
200 °C/h.



16.3.1.4. Casting Denaturing Gradient Gel

• Prepare the low density solution and the high density solution.

• Add 0.09 % (v/v) each of ammonium persulfate and TEMED solutions.
Mix by inverting several times.

• Draw the high and low density solutions into their respective
syringes, taking care that no air bubbles remain in the syringes.

• Place the syringes into the syringe holder of the gradient delivery
system. Rotate the wheel clockwise slowly to deliver the gel solution.

• Carefully insert the comb to the desired well depth. The gel takes
about 30-60 min to polymerize.

• After polymerization, remove the comb by pulling it straight up
slowly and gently.

16.3.1.5. Running the Gel

• Assemble the electrophoresis apparatus and load the sample DNA
prepared earlier.

• Attach the electrical leads to a suitable DC power supply supplied
with the system.

• Run the gel at 130 V overnight.

16.3.1.6. Staining and Documenting the Gel

• Remove the gel from the glass plate.

• Place the gel into a dish containing 250 ml of running buffer and
25 µl of 10 mg/ml ethidium bromide. Stain for 5-15 min.

• After staining, carefully transfer the gel into a dish containing
250 ml of running buffer. Destain for 5-20 min.

• Photograph the gel in a gel documentation system.

16.3.1.7. DGGE Banding Patterns and Statistical Analysis

The digitized DGGE images were analyzed using Gel Pro 
(Bio-Rad). We combined four images of a DGGE gel using 
Adobe Photoshop CS4 (Adobe Systems Inc., USA) to amend 
the relative mobility between different gels. We developed a binary 
matrix containing data on the presence/absence of bands and a 
proportional matrix displaying the percentage of each band based 
on relative pixel intensities for each lane. 

The Shannon-Weaver index ($H'$) was calculated using the function
$H' = -\sum_i P_i \ln P_i$, where $P_i$ is the importance probability
of the bands in a gel lane [11] and $P_i = n_i/N$, where $n_i$ is the
intensity of an individual band and $N$ is the sum of the intensities
of all bands in a lane. We generated an unweighted pair-group
method with arithmetic mean (UPGMA) tree using Ntsys,
based on the band intensity matrix, to examine the similarities in
16.3.1.8. Phylogenetic Analysis of the Sequenced DGGE Bands

16.3.2. ARDRA

16.3.2.1. PCR for Amplification of 16S rDNA Sequence

16.3.2.2. Restriction Digestion

16.3.2.3. Electrophoresis



the DGGE profiles among all 40 samples [12]. Correlations between the
DGGE profiles and sediment physicochemical properties were analyzed by
canonical correspondence analysis (CCA) using Canoco (Microcomputer
Power, USA) [13]. The statistical significance of the relationship
was assessed by Monte Carlo analysis using 1,000 permutations.
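
The Shannon-Weaver index defined above is straightforward to compute from the band intensities of one lane. The sketch below is a minimal illustration, assuming the intensities have already been extracted as numbers by the image analysis software; the function name and the example values are invented.

```python
import math


def shannon_index(intensities: list) -> float:
    """H' = -sum(P_i * ln P_i) with P_i = n_i / N for one gel lane."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("lane must contain at least one band with signal")
    # Bands with zero intensity contribute nothing and are skipped
    proportions = [n / total for n in intensities if n > 0]
    return -sum(p * math.log(p) for p in proportions)


if __name__ == "__main__":
    even_lane = [10, 10, 10, 10]       # four equally intense bands
    skewed_lane = [37, 1, 1, 1]        # one dominant band
    print(shannon_index(even_lane), shannon_index(skewed_lane))
```

For a lane with S equally intense bands the index equals ln S, its maximum for that number of bands, and it falls toward zero as a single band comes to dominate the lane.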

Analyze the fingerprinting pattern and determine the phylogenetic
relationship among the tested bacterial strains. Then construct a
neighbor-joining [14] tree using MEGA 4.1 [15], with bootstrap support
assessed from 1,000 replicates.



Reaction mixture | Cycling conditions
10x buffer: 5 µl | Initial denaturation: 94 °C for 5 min
10 mM dNTP mixture: 1 µl | Denaturation: 94 °C for 30 s (30 cycles)
10 µM forward primer: 1 µl | Annealing: 55 °C for 30 s (30 cycles)
10 µM reverse primer: 1 µl | Extension: 72 °C for 2 min (30 cycles)
Taq polymerase (2.5 U/µl): 1 µl | Final extension: 72 °C for 5 min
Water: 39 µl | Hold: 4 °C indefinitely
Template DNA: 2 µl |
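
As a quick arithmetic check of the reaction mixture and cycling conditions above, the following sketch adds up the component volumes and estimates the programmed run time. It assumes the volumes and step times exactly as listed and ignores the time the thermal cycler spends ramping between temperatures; the variable and function names are hypothetical.

```python
# Component volumes in microlitres, as listed in the reaction mixture
ARDRA_MIX_UL = {
    "10x buffer": 5,
    "10 mM dNTP mixture": 1,
    "10 uM forward primer": 1,
    "10 uM reverse primer": 1,
    "Taq polymerase (2.5 U/ul)": 1,
    "water": 39,
    "template DNA": 2,
}


def total_volume(mix: dict) -> float:
    """Final reaction volume in microlitres."""
    return sum(mix.values())


def run_time_min(initial: float, cycle_steps: list, cycles: int,
                 final: float) -> float:
    """Programmed time in minutes, excluding ramping and the final hold."""
    return initial + cycles * sum(cycle_steps) + final


if __name__ == "__main__":
    print(total_volume(ARDRA_MIX_UL))               # reaction volume in ul
    # 5 min, then 30 x (30 s + 30 s + 2 min), then 5 min
    print(run_time_min(5, [0.5, 0.5, 2], 30, 5))    # minutes
```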



• Take 10 µl of the amplified 16S rDNA product in a microcentrifuge
tube.

• Digest it with 5 U of the restriction enzyme AluI, adding
restriction buffer to make a total volume of 20 µl.

• Incubate for 2 h at 37 °C for restriction digestion.

• Run the restriction-digested product on a 3 % agarose gel at 130 V
for 3 h.

• Analyze the banding pattern with a gel documentation system and
determine the phylogenetic relationship.
16.3.3. 16S rRNA Gene Sequencing

16.3.3.1. 16S rRNA Gene Amplification

Reaction mixture | Cycling conditions
10x buffer: 5 µl | Initial denaturation: 94 °C for 5 min
10 mM dNTP mixture: 1 µl | Denaturation: 94 °C for 30 s (30 cycles)
10 µM forward primer: 1 µl | Annealing: 55 °C for 30 s (30 cycles)
10 µM reverse primer: 1 µl | Extension: 72 °C for 2 min (30 cycles)
Taq polymerase (2.5 U/µl): 1 µl | Final extension: 72 °C for 5 min
Water: 39 µl | Hold: 4 °C indefinitely
Template DNA: 2 µl |

16.3.3.2. Purification of PCR Product

Purify the PCR product using either a commercially available gel
elution kit or a PCR purification kit.

16.3.3.3. Sequencing

The purified PCR product is sequenced by the dideoxy chain termination
method, the chemical method, or pyrosequencing.

16.3.3.4. BLAST

The sequences obtained can be compared with databases such as NCBI and
ExPASy to determine the percentage of similarity, on the basis of
which the phylogenetic relationship of the organism to others can be
determined.

16.3.4. rpoB Gene Sequencing

16.3.4.1. rpoB Gene Amplification

Reaction mixture | Cycling conditions
10x buffer: 5 µl | Initial denaturation: 94 °C for 5 min
10 mM dNTP mixture: 1 µl | Denaturation: 94 °C for 30 s (30 cycles)
10 µM forward primer: 1 µl | Annealing: 55 °C for 30 s (30 cycles)
10 µM reverse primer: 1 µl | Extension: 72 °C for 2 min (30 cycles)
Taq polymerase (2.5 U/µl): 1 µl | Final extension: 72 °C for 5 min
Water: 39 µl | Hold: 4 °C indefinitely
Template DNA: 2 µl |

16.3.4.2. Purification of PCR Product

Purify the PCR product using either a commercially available gel
elution kit or a PCR purification kit.

16.3.4.3. Sequencing

The purified PCR product is sequenced by the dideoxy chain termination
method, the chemical method, or pyrosequencing.

16.3.4.4. BLAST

The sequences obtained can be compared with databases such as NCBI and
ExPASy to determine the percentage of similarity, on the basis of
which the phylogenetic relationship of the organism to others can be
determined.

16.3.5. gyrA Gene Sequencing

16.3.5.1. gyrA Gene Amplification

Reaction mixture | Cycling conditions
10x buffer: 5 µl | Initial denaturation: 94 °C for 5 min
10 mM dNTP mixture: 1 µl | Denaturation: 94 °C for 30 s (30 cycles)
10 µM forward primer: 1 µl | Annealing: 55 °C for 30 s (30 cycles)
10 µM reverse primer: 1 µl | Extension: 72 °C for 2 min (30 cycles)
Taq polymerase (2.5 U/µl): 1 µl | Final extension: 72 °C for 5 min
Water: 39 µl | Hold: 4 °C indefinitely
Template DNA: 2 µl |

16.3.5.2. Purification of PCR Product

Purify the PCR product using either a commercially available gel
elution kit or a PCR purification kit (GenElute, Sigma-Aldrich).

16.3.5.3. Sequencing

The purified PCR product is sequenced by the dideoxy chain termination
method, the chemical method, or pyrosequencing, using automated DNA
sequencers.

16.3.5.4. BLAST

BLAST (Basic Local Alignment Search Tool) is a similarity search
program developed at NCBI (http://www.ncbi.nlm.nih.gov/BLAST/). It is
available as a free service over the Internet that provides very fast,
accurate, and sensitive database searching.
Table 16.1

URL addresses of some commonly used databases

Name of the program | Address (URL)
BLAST Network Service on ExPASy | http://us.expasy.org/tools/blast/
BLAST at EMBnet-CH/SIB (Switzerland) | http://www.ch.embnet.org/software/BottomBLAST.html?
BLAST at NCBI | http://www.ncbi.nlm.nih.gov/BLAST/
WU-BLAST at the EBI | http://www.ebi.ac.uk/blast2/
BLAST at PBIL (Lyon) | http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_blast.html



BLAST uses a heuristic algorithm that seeks local, as opposed to
global, alignments and is therefore able to detect relationships among
sequences that share only isolated regions of similarity. The
sequences obtained can be compared with databases such as NCBI and
ExPASy to determine the percentage of similarity, on the basis of
which the phylogenetic relationship of the organism to others can be
determined. Some of the most used BLAST services are listed in
Table 16.1.



16.3.6. ClustalX

16.3.6.1. Obtaining a Related Sequence by a BLAST Search

We already have a particular nucleic acid or protein sequence of
interest and need to find other sequences that are related to it,
i.e., sequences sufficiently similar to the sequence of interest that
we believe they share a common ancestor. This may be done by the
following steps:

• Go to http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-newblast?Jform=0

• Open the electronic file containing your sequence of interest and
copy the sequence.

• Return to the BLAST page and click in the text box below the label
Sequence in FASTA format.

• In this configuration, the accession number of the sequence of
interest can also be typed into the text box.

• Click the Submit Query button to submit the sequence to BLAST.

• In the results, at the left of each sequence are the name and
accession number of the file, which is in one of the databases that
has been searched; in the middle is a brief description of the file.
16.3.6.2. Creating Multiple Sequence Alignment



• Bit score: for DNA sequences, the bit score is derived from the raw
alignment score, in which matches are rewarded and mismatches and gaps
are penalized, and is normalized so that scores from different
searches can be compared. The higher the bit score, the more closely
related that sequence is to the sequence of interest.

• At the far right is the E value, which is the number of such matches
to the current non-redundant sequence database that are expected by
chance alone. The smaller the E value, the more likely it is that the
similarity is real. (This is a critical choice because the selected
list will be the set of sequences from which we construct our
phylogenetic tree.)

• To download a selected sequence, click the FASTA button; the file
will be saved in FASTA format.

A pair of sequences can be aligned by writing one above the other
in such a way as to maximize the number of residues (nucleotides
or amino acids) that match, introducing gaps (spaces) into one
or the other sequence where needed. Biologically, those gaps are
assumed to represent insertions or deletions that occurred as the
sequences diverged from a common ancestor. A scoring system is
used so that matching residues get a positive numerical score and
gaps get a negative score, or gap penalty. An alignment program
seeks an arrangement that maximizes the net score. Gap penalties
are typically set by the user. ClustalX is an updated version of
ClustalW, which is one of the best tools for creating multiple
alignments. The steps for alignment are as follows:

• Create your input file in FASTA format

• Copy the entire sequence file, including the first line, and
paste it into the CelF.in file. ClustalX treats everything
between ">" and the first space as the sequence name

• Save the file in plain text or ASCII format; this is important
because ClustalX will not recognize Word or any other word
processor file

• Go to http://www.biozentrum.unibas.ch/biophit/clustal/ClustalX_help.html

• Pull down the File menu and choose the Load Sequences menu
item

• Pull down the Alignment menu and choose Alignment Parameters
(set pairwise alignments to Slow-Accurate, Gap Opening Penalty
to 15.00, Gap Extension Penalty to 0.30, Delay Divergent
Sequences to 25 %, and output format to Nexus, and turn Clustal
Sequence Numbers on)

• Choose Do Complete Alignment under the Alignment menu

• After the alignment is complete, eliminate truncated sequences

• Delete nonhomologous regions from the sequences

• Change the gap penalties

• Use PAUP to create a tree
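
The scoring idea behind pairwise alignment (a positive score for matching residues, a penalty for each gap, and a search for the arrangement with the highest net score) can be sketched in a few lines. This is only a minimal illustration using a standard dynamic-programming (Needleman-Wunsch style) recurrence; it is not the algorithm ClustalX uses internally, and the function name and the default scores (match 1, mismatch 0, gap -2) are assumptions chosen for the example.

```python
def global_alignment_score(seq_a, seq_b, match=1, mismatch=0, gap=-2):
    """Return the best net score of a global alignment of two sequences.

    Illustrative sketch only: linear gap penalty, hypothetical default scores.
    """
    # prev[j] holds the best score for aligning the residues of seq_a
    # consumed so far against the first j residues of seq_b.
    prev = [j * gap for j in range(len(seq_b) + 1)]
    for i in range(1, len(seq_a) + 1):
        curr = [i * gap]  # aligning i residues of seq_a against nothing
        for j in range(1, len(seq_b) + 1):
            same = seq_a[i - 1] == seq_b[j - 1]
            diagonal = prev[j - 1] + (match if same else mismatch)
            gap_in_b = prev[j] + gap      # residue of seq_a opposite a gap
            gap_in_a = curr[j - 1] + gap  # residue of seq_b opposite a gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACGT"))
    print(global_alignment_score("ACGT", "AGT"))
```

With these assumed scores, raising the gap penalty makes the program prefer mismatches over gaps, which is why the choice of gap penalties changes the resulting alignment.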

16.3.6.3. Creating Tree

The phylogenetic tree can be drawn by the following steps:

• Open http://paup.csit.fsu.edu/ 

• Create the input file in Nexus format 

• Pull down the Analysis menu and choose distance 

• Using same Analysis menu, choose Neighbor joining/UPGMA 

• Accept the computer generated number and click OK 

• Click the Preview button and you can see a slanted phyloge- 
netic tree which can be printed using print tree. 
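
Distance methods such as neighbor joining and UPGMA start from a table of pairwise distances between the aligned sequences. As a rough sketch of what such a table contains, the example below computes the simple proportion of differing sites (the p-distance) for every pair of sequences in an alignment. The function names and the toy alignment are hypothetical, and PAUP may apply a different distance correction depending on the settings chosen.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns in which either sequence has a gap are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return differing / compared


def distance_matrix(alignment):
    """Map each pair of sequence names to its p-distance."""
    names = sorted(alignment)
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(names, 2)
    }


if __name__ == "__main__":
    toy = {"seq1": "ACGTACGT", "seq2": "ACGTACGA", "seq3": "AC-TTCGA"}
    for pair, dist in distance_matrix(toy).items():
        print(pair, round(dist, 3))
```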

16.3.7. Cladogram

Obtain a Nexus file by multiple sequence alignment and follow
these steps to draw a cladogram:

• Open PAUP link 

• Pull down the Analysis menu and be sure that Parsimony is
checked

• Choose Heuristic Search 

• In the resulting dialogue leave everything in its default state 

• Click the Search button 

When the search is completed, a Close button is shown together
with the number of trees that were created. The trees are now in
memory and can be printed by selecting Print Trees from the Trees
menu or viewed with the TreeView software
(http://taxonomy.zoology.gla.ac.uk/rod/treeview.html).
Microarray Technology: Basic Concept, Protocols,
and Applications 

P.P. Dubey and Dinesh Kumar 

Abstract 

Microarray technology enables researchers to investigate and address issues that were once thought to be
nontraceable. Over the past few years, this powerful technology has been used to explore transcriptional
profiles and genome differences for a variety of microorganisms, greatly facilitating our understanding of
microbial metabolism. With the increasing availability of complete microbial genomes, DNA microarrays
are becoming a common tool in many areas of microbial research, including microbial physiology,
pathogenesis, epidemiology, ecology, phylogeny, and pathway engineering. The expression of many genes
can be analyzed quickly and efficiently in a single reaction. DNA microarray technology helps in
identifying new genes, in learning about their function and expression levels under different conditions,
in comparative genomics, and in SNP identification. This chapter outlines the principle of microarray
technology, the types of microarray, their basic protocols, and applications.



17.1 Introduction 



Microarray technology evolved from Southern blotting, in which
fragmented DNA is attached to a substrate and then probed
with a known gene or fragment. The use of a collection of distinct
DNAs in arrays for expression profiling was first described in
1987, when arrayed DNAs were used to identify genes whose
expression is modulated by interferon. These early gene arrays
were made by spotting cDNAs onto filter paper with a pin-spotting
device. The use of miniaturized microarrays for gene expression
profiling was first reported in 1995, and a complete eukaryotic
genome (Saccharomyces cerevisiae) on a microarray was published
in 1997. Thousands of genes and their products (i.e., RNA and
proteins) in a given living organism function in a complex and
orchestrated way that creates the mystery of life. However,
traditional methods in molecular biology generally work on a
one-gene-per-experiment basis, which means that the throughput is
very limited and it is very hard to obtain the complete picture
of gene function in an organism. The invention of the polymerase
chain reaction (PCR) produced a surge in new experiments. Because
PCR accelerated and simplified procedures previously performed
much more laboriously by traditional molecular cloning, it quickly
found use in experimental molecular biology.

In the past several years, a new technology, called DNA
microarray, has attracted tremendous interest among biologists.
DNA microarrays are a powerful tool for investigating various
aspects of prokaryotic biology because they allow the simultaneous
monitoring of the expression of all genes in any bacterium. They
offer a more holistic approach to the study of cellular physiology
and therefore complement the traditional "gene-by-gene" approaches
[1]. The term DNA microarray was coined in publications from the
laboratories of DeRisi [2] and Schena [3]. This technology
promises to monitor the whole genome on a single chip so that the
researcher can have a better picture of the interactions among
thousands of genes simultaneously. Molecular biology research
evolves through the development of the technologies used to carry
it out. It is not possible to study a large number of genes using
traditional methods. DNA microarray technology has empowered the
scientific community to understand the fundamental aspects
underlying the growth and development of life as well as to
explore the genetic causes of anomalies occurring in the
functioning of the human body.

D.K. Arora et al. (eds.), Analyzing Microbes, Springer Protocols,
DOI 10.1007/978-3-642-34410-7_17, © Springer-Verlag Berlin Heidelberg 2013

17.1.1. Principle

A typical microarray experiment involves the hybridization of an
mRNA molecule to the DNA template from which it originated.
Many DNA samples are used to construct an array, and their number
may run into the thousands. The amount of mRNA bound to each site
on the array indicates the expression level of the corresponding
gene. All the data are collected and a profile is generated for
gene expression in the cell. Additional advantages of microarrays
are that they are highly sensitive and small. The microarray
constitutes a large array of highly ordered, immobilized target
sequences attached to a solid surface. Each target sequence
corresponds to a different gene. mRNA is taken from a particular
cell line or tissue and combined with a marker to generate a
labeled sample. This sample is then hybridized onto the target
sequences of the microarray. The labeled sample binds to its
complementary sequence, so for each gene the amount of marker is
detected, and this provides a level of expression for that gene.
Because a large number of measurements are taken simultaneously,
a huge amount of data is produced. Various techniques have been
developed to deal constructively with the size and variability of
microarray data.
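
As a minimal sketch of how detected marker intensities become comparable expression values, the example below takes two-channel spot intensities, subtracts the local background, rescales one channel so that the total intensities of the two channels are equal, and reports a log2 ratio per spot. The function name, the `floor` safeguard against non-positive values, and the example numbers are assumptions made for illustration; real analysis packages offer more sophisticated normalization.

```python
import math


def log_ratios_two_channel(cy5, cy3, bg5, bg3, floor=1.0):
    """Background-subtract, total-intensity normalize, and return log2(Cy5/Cy3).

    All arguments are equal-length lists of per-spot values. `floor` is a
    hypothetical safeguard that keeps corrected intensities positive.
    """
    if not (len(cy5) == len(cy3) == len(bg5) == len(bg3)):
        raise ValueError("all inputs must have the same number of spots")
    red = [max(signal - bg, floor) for signal, bg in zip(cy5, bg5)]
    green = [max(signal - bg, floor) for signal, bg in zip(cy3, bg3)]
    # Scale the red channel so that both channels have the same total intensity.
    scale = sum(green) / sum(red)
    return [math.log2(r * scale / g) for r, g in zip(red, green)]


if __name__ == "__main__":
    ratios = log_ratios_two_channel(
        cy5=[110.0, 410.0], cy3=[260.0, 260.0],
        bg5=[10.0, 10.0], bg3=[10.0, 10.0],
    )
    print([round(x, 3) for x in ratios])
```

A positive value indicates a spot that is brighter in the Cy5 channel after normalization, and a negative value a spot that is brighter in the Cy3 channel.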

An array is an orderly arrangement of samples where matching
of known and unknown DNA samples is done based on base
pairing rules. An array experiment makes use of common assay
systems such as microplates or standard blotting membranes.
Fig. 17.1 Example of an approximately 40,000 probe spotted Oligo microarray.



The sample spots are typically less than 200 µm in diameter,
and an array usually contains thousands of spots (Fig. 17.1).
Thousands of spotted samples known as probes (with known identity)
are immobilized on a solid support (microscope glass slides,
silicon chips, or nylon membrane). The spots can be DNA, cDNA, or
oligonucleotides. These are used to determine complementary
binding of the unknown sequences, thus allowing parallel analysis
for gene expression and gene discovery. An experiment with a
single DNA chip can provide information on thousands of genes
simultaneously. An orderly arrangement of the probes on the
support is important, as the location of each spot on the array is
used for the identification of a gene.

17.1.2. Types of Microarrays

Depending upon the kind of immobilized sample used to construct
arrays and the information fetched, microarray experiments can be
categorized in three ways:

(a) Microarray expression analysis: In this experimental setup,
the cDNA derived from the mRNA of known genes is immobilized.
The sample has genes from both the normal and the diseased
tissues. Spots with more intensity are obtained for a
diseased-tissue gene if the gene is overexpressed in the
diseased condition. This expression pattern is then compared
to the expression pattern of a gene responsible for a disease.

(b) Microarray for mutation analysis: For this analysis, the
researchers use gDNA. The genes might differ from each
other by as little as a single nucleotide base. A single base
difference between two sequences is known as a Single
Nucleotide Polymorphism (SNP), and detecting such differences
is known as SNP detection.
(c) Comparative genomic hybridization: It is used to identify
increases or decreases in important chromosomal fragments
harboring genes involved in a disease.

According to the nature of the probe, microarrays can be
classified as (a) double-stranded DNA microarrays and (b)
oligonucleotide DNA microarrays, corresponding to the two major
types of probes used with DNA microarray printers. Double-stranded
DNA commonly results from PCR amplification [4]. A length of 200
to 800 bp of amplified DNA is recommended, but larger fragments of
up to 1.5 kb also work [5]. In a typical microarray design, each
probe DNA corresponds to one gene. This represents the original
type of DNA microarray, in which cDNA molecules from Arabidopsis
thaliana were amplified by PCR and spotted.

17.1.3. Affymetrix GeneChips

The most prominent microarrays with in situ synthesized probes
are the GeneChips manufactured by Affymetrix, Santa Clara, CA,
USA [6]. They are produced by chemical synthesis of the
oligonucleotides directly on the coated quartz surface of the
array [7]. This technology allows very high feature densities. It
is typical to have 400,000 features on a commercial array [8].
Therefore, they are called high-density oligonucleotide arrays.
GeneChips are produced in a unique photolithographic process
analogous to the methods used for production of microelectronics
chips, in combination with chemical reactions developed for
combinatorial chemistry. A quartz wafer is coated with a thin
layer of a light-sensitive compound. This coating prevents the
covalent coupling of an activated nucleotide. Exposure to light
causes the removal of the chemical protection groups from the
surface. Subsequently applied reactive derivatives of single
nucleotides can then be coupled. The attached nucleotides again
carry a light-sensitive protection group that has to be removed by
illumination before the next nucleotide is coupled. Lithographic
masks are used to block or transmit light onto specific features,
thereby determining the order of nucleotides to be coupled to the
growing oligonucleotides. In repeated cycles of masking, light
exposure, and coupling, oligonucleotides 25 residues in length are
synthesized on the chip surface. As the specificity of a probe of
25 nucleotides may not be high enough, each probe ("match") is
accompanied by a negative control with a single differing base in
the middle of the probe, termed the mismatch probe. The
performance of probe and mismatch probe can therefore be used to
detect and eliminate cross-hybridization. A probe and its mismatch
probe are called a probe pair. Usually, 11-15 probe pairs, called
a probe set, are used to represent a single gene. The very high
feature density in this type of microarray enables the high number
of controls.
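
To make the role of the mismatch control concrete, the sketch below summarizes one probe set by averaging the difference between each match probe and its mismatch partner, and also reports the fraction of probe pairs in which the match probe is brighter. This is a deliberately simplified illustration, not the vendor's actual signal algorithm; the function names and intensity values are hypothetical.

```python
def average_difference(match, mismatch):
    """Mean of (match - mismatch) over the probe pairs of one probe set."""
    if len(match) != len(mismatch) or not match:
        raise ValueError("need the same non-zero number of match and mismatch values")
    differences = [pm - mm for pm, mm in zip(match, mismatch)]
    return sum(differences) / len(differences)


def fraction_match_brighter(match, mismatch):
    """Fraction of probe pairs in which the match probe exceeds its mismatch."""
    if len(match) != len(mismatch) or not match:
        raise ValueError("need the same non-zero number of match and mismatch values")
    brighter = sum(1 for pm, mm in zip(match, mismatch) if pm > mm)
    return brighter / len(match)


if __name__ == "__main__":
    # Hypothetical intensities for a probe set of 11 probe pairs.
    pm = [520, 610, 480, 700, 655, 590, 430, 615, 560, 640, 505]
    mm = [210, 250, 260, 300, 240, 610, 200, 230, 280, 260, 220]
    print(round(average_difference(pm, mm), 1))
    print(round(fraction_match_brighter(pm, mm), 2))
```

A probe set in which the mismatch probes are about as bright as the match probes suggests that much of the signal comes from cross-hybridization.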
Fig. 17.2 Steps in microarray experiment.

17.1.4. Steps in Microarray Experiment

The following steps are involved in microarray experiments
(Fig. 17.2):

1. The two samples to be compared (pairwise comparison) are 
grown/acquired, e.g., treated sample (case) and untreated 
sample (control). 

2. The nucleic acid of interest is purified: this can be all RNA
for expression profiling, DNA for comparative hybridization,
or DNA/RNA bound to a particular protein which is
immunoprecipitated (ChIP-on-chip) for epigenetic or regulation
studies. In this example, total RNA is isolated (total as
it is nuclear and cytoplasmic) by guanidinium thiocyanate-
phenol-chloroform extraction (e.g., Trizol), which isolates
most RNA (whereas column methods have a cutoff of 200
nucleotides) and, if done correctly, gives better purity.

3. The purified RNA is analyzed for quality (by capillary
electrophoresis) and quantity (by using a NanoDrop
spectrometer); if enough material (>1 µg) is present, the
experiment can continue.

4. The labeled product is generated via reverse transcription,
sometimes followed by an optional PCR amplification. The RNA is
reverse transcribed with either poly-T primers, which prime
only mRNA, or random primers, which prime all RNA, most of
which is rRNA. In miRNA microarrays, an oligonucleotide is
ligated to the purified small RNA (isolated with a
fractionator), which is then reverse transcribed and amplified.
The label is added either in the RT step or in an additional
step after amplification, if amplification is performed. Which
strand is labeled depends on the microarray: if the label is
added with the RT mix, the cDNA corresponds to the template
(antisense) strand while the probes correspond to the sense
strand (unless they are negative controls). The label is
typically fluorescent; only one machine uses radiolabels. The
labeling can be direct (not used) or indirect, which requires a
coupling stage. The coupling stage can occur before
hybridization (two-channel arrays), using aminoallyl-UTP and
NHS amino-reactive dyes (such as cyanine dyes), or after
hybridization (single-channel arrays), using biotin and labeled
streptavidin. The modified nucleotides (a 1 aaUTP : 4 TTP mix)
are incorporated enzymatically at a lower rate than normal
nucleotides, typically resulting in 1 every 60 bases (measured
with a spectrophotometer). The aaDNA is then purified with a
column (using a solution containing phosphate buffer, as Tris
contains amine groups). The aminoallyl group is an amine group
on a long linker attached to the nucleobase, which reacts with
a reactive dye. A dye flip is a type of replicate done to
remove any dye effects in two-channel arrays: one sample is
labeled with Cy3 and the other with Cy5, and this is reversed
on a different slide, e.g., in the presence of aminoallyl-UTP
added in the RT mix.

5. The labeled samples are then mixed with a proprietary
hybridization solution, which may contain SDS, SSC, dextran
sulfate, a blocking agent (such as COt-1 DNA, salmon sperm DNA,
calf thymus DNA, PolyA, or PolyT), Denhardt's solution,
and formamide.

6. This mix is denatured and added to a pinhole in a microarray,
which can be a gene chip (holes in the back) or a glass
microarray that is bound by a cover, called a mixer, containing
two pinholes and sealed with the slide at the perimeter.

7. The holes are sealed and the microarray is hybridized, either
in a hyb oven, where the microarray is mixed by rotation, or in
a mixer, where the microarray is mixed by alternating pressure
at the pinholes.

8. After an overnight hybridization, all nonspecific binding is
washed off (SDS and SSC).

9. The microarray is dried and scanned in a special machine
where a laser excites the dye and a detector measures its
emission.

10. The image is gridded with a template and the intensities of
the features (several pixels make a feature) are quantified.

11. The raw data are normalized: the simplest way is to subtract
the background intensity and then divide the intensities so
that either the total intensity of the features on each channel
or the intensities of a reference gene are equal; the t-value
for all the intensities is then calculated. More sophisticated
methods include z-ratio, loess and lowess regression, and
RMA (robust multichip analysis) for Affymetrix chips
(single-channel, silicon chip, in situ synthesized short
oligonucleotides).

17.2 Typical Protocols and Major Steps for Microarray

17.2.1. Fabrication

This protocol describes the steps required to produce a cDNA
microarray. Gene-specific DNA is produced by PCR amplification
of purified template plasmid DNAs from cloned ESTs. The PCR
product is purified by ethanol precipitation, thoroughly
resuspended in 3× SSC, and printed onto a poly-L-lysine-coated
slide.

17.2.1.1. Slide Coating

Slides coated with poly-L-lysine have a surface that is both
hydrophobic and positively charged. The hydrophobic character of
the surface minimizes spreading of the printed spots, and the
charge appears to help position the DNA on the surface in a way
that makes cross-linking more efficient.

17.2.1.1.1. Materials, Reagents, and Solutions

• Gold Seal microscope slides (#3011, Becton Dickinson,
Franklin Lakes, NJ)

• Ethanol (100 %)

• Poly-L-lysine (#P8920, Sigma, St. Louis, MO)

• 50-slide stainless steel rack, #900401, and 50-slide glass tank,
#900401 (Wheaton Science Products, Millville, NJ)

• Sodium hydroxide

• Stir plate

• Stir bar

• Platform shaker

• 30-slide rack, #196, plastic, and 30-slide box, #195, plastic
(Shandon Lipshaw, Pittsburgh, PA)

• Sodium chloride

• Potassium chloride

• Sodium phosphate dibasic heptahydrate

• Potassium phosphate monobasic

• Autoclave

• 0.2 µm filter: Nalgene

• Centrifuge: Sorvall Super 20

• Slide box (plastic with no paper or cork liners) (e.g., #60-6306-02,
PGC Scientific, Gaithersburg, MD)

• 1 l glass beaker

• 1 l graduated cylinder

1 M Sodium Borate (pH 8.0)

• Dissolve 61.83 g of boric acid in 900 ml of DEPC H2O.
Adjust the pH to 8.0 with 1 N NaOH. Bring the volume up to 1 l.
Sterilize with a 0.2 µm filter and store at room temperature.

Cleaning Solution

• H2O 400 ml

• Ethanol 600 ml

• NaOH 100 g

Dissolve the NaOH in the H2O. Add the ethanol and stir until the
solution clears. If the solution does not clear, add H2O until
it does.

Poly-L-Lysine Solution

• Poly-L-lysine (0.1 % w/v) 35 ml

• PBS 35 ml

• H2O 280 ml

17.2.2. Procedure

1. Place slides into 50-slide racks and place racks in glass tanks
with 500 ml of cleaning solution. Gold Seal slides are highly
recommended, as they have been found to have consistently
low levels of autofluorescence. It is important to wear
powder-free gloves when handling the slides. Change gloves
frequently, as random contact with skin and surfaces transfers
grease to the gloves.

2. Place tanks on platform shaker for 2 h at 60 rpm.

3. Pour out cleaning solution and wash in H2O for 3 min.
Repeat wash four times.

4. Transfer slides to 30-slide plastic racks and place into small
plastic boxes for coating.

5. Submerge slides in 200 ml poly-L-lysine solution per box.

6. Place slide boxes on platform shaker for 1 h at 60 rpm.

7. Rinse slides three times with H2O.

8. Submerge slides in H2O for 1 min.

9. Spin slides in centrifuge for 2 min at 400 × g and dry the
slide boxes used for coating.

10. Place slides back into the slide box used for coating and let
stand overnight before transferring to a new slide box for
storage. This allows the coating to dry before handling.

11. Allow slides to age for 2 weeks on the bench, in a new slide
box, before printing on them. The coating dries slowly,
becoming more hydrophobic with time.

17.2.3. Slide Blocking

At the end of the print, slides are removed from the printer, 
labeled with the print identifier and the slide number by writing 
on the edge of the slide with a diamond scribe, and placed in a 
dust-free slide box to age for 1 week. It is useful to etch a line, 
which outlines the printed area of the slide, onto the first slide. 
This serves as a guide to locate the area after the slides have been 
processed, and the salt spots are washed off. 

1. Place slides, printed face up, in a casserole dish and cover
with cling wrap. Expose slides to a 450 mJ dose of ultraviolet
irradiation in the Stratalinker. Slides should have been aged
at ambient temperature in a closed slide box for 1 week prior
to blocking.

2. Transfer slides to a 30-slide stainless steel rack and place
rack into a small glass tank.

3. Dissolve 6.0 g succinic anhydride in 325 ml
1-methyl-2-pyrrolidinone in a glass beaker by stirring with a
stir bar. Nitrile gloves should be worn and work carried out in
a chemical fume hood while handling 1-methyl-2-pyrrolidinone
(a teratogen).

4. Add 25 ml 1 M sodium borate buffer (pH 8.0) to the beaker.
Allow the solution to mix for a few seconds, and then pour
rapidly into the glass tank with slides. Succinic anhydride
hydrolyzes quite rapidly once the aqueous buffer solution is
added. To obtain quantitative passivation of the poly-L-lysine
coating, it is critical that the reactive solution be brought
in contact with the slides as quickly as possible.

5. Place the glass tank on a platform shaker in a fume hood for
20 min. Small particulates resulting from precipitation of
reaction products will be visible in the fluid.

6. While the slides are incubating on the shaker, prepare a boiling
H2O bath to denature the DNA on the slides.

7. After the slides have incubated for 20 min, transfer them into
the boiling H2O bath. Immediately turn off the heating element
after submerging the slides in the bath. Allow slides to
stand in the H2O bath for 2 min.

8. Transfer the slides into a glass tank filled with 100 % ethanol
and incubate for 4 min.

9. Remove the slides and centrifuge at 400 rpm for 3 min in a
horizontal microtiter plate rotor to dry the slides.

10. Transfer slides to a clean, dust-free slide box and let them
stand overnight before hybridizing.

17.2.4. Printing

The variety of printers and pens for transferring PCR products
from titer plates to slides precludes highly detailed descriptions
of the process. The following steps provide a general description
of the processing:

1. Pre-clean the print pens according to the manufacturer’s 
specification. 

2. Load the printer slide deck with poly-L-lysine-coated slides. 

3. Thaw the plates containing the purified EST PCR products 
and centrifuge briefly, 2 min, at 1,000 rpm in a horizontal 
microtiter plate rotor to remove condensation and droplets 
from the seals before opening. 

4. Transfer 5-10 µl of the purified EST PCR products to a plate
that will serve as the source of solution for the printer.

5. Printing with quill-type pens usually requires that the volume
of fluid in the print source be sufficiently low that, when the
pen is lowered to the bottom of the well, it is submerged in
the solution to a depth of less than a millimeter. This keeps
the pen from carrying a large amount of fluid on the outside of
the pen shaft and producing variable, large spots on the first
few slides printed.

6. Run a repetitive test print on the first slide. In this
operation, the pens are loaded with the DNA solution, and then
the pens serially deposit this solution on the first slide in
the spotting pattern specified for the print. This test is run
to check the size and shape of the specified spotting pattern
and its placement on the slide. It also serves to verify that
the pens are loading and spotting and that a single loading
will produce as many spots as are required to deliver material
to every slide in the printer.

7. If one or more of the pens is not performing at the desired
level, re-clean or substitute another pen and test again.

8. If all pens are performing, carry out the full print.

17.2.5. Sample Labeling

1. If using an anchored oligo dT primer, anneal the primer to the
RNA in the following 17 µl reaction (use a 0.2 ml thin wall
PCR tube so that incubations can be carried out in a PCR
cycler):

   Component                   Addition for     Addition for
                               Cy5 labeling     Cy3 labeling
   Total RNA (>7 mg/ml)        150-200 µg       50-80 µg
   Anchored primer (2 µg/µl)   1 µl             1 µl
   DEPC H2O                    to 17 µl         to 17 µl

2. If using an oligo dT(12-18) primer, anneal the primer to the
RNA in the following 17 µl reaction:

   Component                     Addition for     Addition for
                                 Cy5 labeling     Cy3 labeling
   Total RNA (>7 mg/ml)          150-200 µg       50-80 µg
   dT(12-18) primer (1 µg/µl)    1 µl             1 µl
   DEPC H2O                      to 17 µl         to 17 µl

   The incorporation rate for Cy5-dUTP is less than that of
   Cy3-dUTP, so more RNA is labeled to achieve more equivalent
   signal from each species.

3. Heat to 65 °C for 10 min and cool on ice for 2 min.

4. Add 23 µl of reaction mixture containing either Cy5-dUTP or
Cy3-dUTP nucleotides, mix well by pipetting, and use a brief
centrifuge spin to concentrate in the bottom of the tube:



Reaction mixture (23 µl)

   5× first strand buffer        8 µl
   10× low T dNTPs mix           4 µl
   Cy5 or Cy3 dUTP (1 mM)        4 µl
   M DTT                         4 µl
   Rnasin (30 u/µl)              1 µl
   Superscript II (200 u/µl)     2 µl



Superscript polymerase is very sensitive to denaturation at air/ 
liquid interfaces, so be very careful to suppress foaming in all 
handling of this reaction. 
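
When several labeling reactions are set up at once, it is common to prepare the reaction mixture as a single master mix. The sketch below scales the per-reaction volumes listed in the reaction mixture above and checks that they add up to 23 µl. The 10 % overage for pipetting losses and the function name are assumptions introduced for illustration and are not part of the protocol.

```python
# Per-reaction volumes (in microliters) taken from the reaction mixture table.
REACTION_MIX_UL = {
    "5x first strand buffer": 8.0,
    "10x low T dNTPs mix": 4.0,
    "Cy5 or Cy3 dUTP (1 mM)": 4.0,
    "DTT": 4.0,
    "Rnasin (30 u/ul)": 1.0,
    "Superscript II (200 u/ul)": 2.0,
}


def master_mix(n_reactions, overage=0.10):
    """Scale each component for n_reactions plus a fractional overage.

    The overage (default 10 %) is a hypothetical allowance for pipetting loss.
    """
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in REACTION_MIX_UL.items()}


if __name__ == "__main__":
    print("per reaction:", sum(REACTION_MIX_UL.values()), "ul")
    for name, volume in master_mix(4).items():
        print(f"{name}: {volume} ul")
```

Because the incorporation rates of the two dyes differ, a separate master mix would be prepared for the Cy5 and the Cy3 reactions.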
1. Incubate at 42 °C for 30 min. Then add 2 µl Superscript II.
Make sure the enzyme is well mixed in the reaction volume
and incubate at 42 °C for 30-60 min.

2. Add 5 µl of 0.5 M EDTA.

3. Add 10 µl 1 N NaOH and incubate at 65 °C for 60 min to
hydrolyze residual RNA. Cool to room temperature.

4. Neutralize by adding 25 µl of 1 M Tris-HCl (pH 7.5).

5. Desalt the labeled cDNA by adding the neutralized reaction,
400 µl of TE pH 7.5, and 20 µg of human COt-1 DNA to a
MicroCon 100 cartridge. Pipette to mix and spin for 10 min
at 500 × g.

6. Wash again by adding 200 µl TE pH 7.5 and concentrating to
about 20-30 µl (approximately 8-10 min at 500 × g).

7. Recover by inverting the concentrator over a clean collection
tube and spinning for 3 min at 500 × g.

In some cases, the Cy5-labeled cDNA will form a gelatinous
blue precipitate that is recovered in the concentrated volume. The
presence of this material signals the presence of contaminants. The
more extreme the contamination, the greater the fraction of
cDNA that will be captured in this gel. Even if heat solubilized,
this material tends to produce uniform, nonspecific binding to the
DNA targets. When concentrating by centrifugal filtration, the
times required to achieve the desired final volume are variable.
Overly long spins can remove nearly all the water from the
solution being filtered. When fluor-tagged nucleic acids are
concentrated onto the filter in this fashion, they are very hard
to remove, so it is necessary to approach the desired volume by
conservative approximations of the required spin times. If control
of volumes proves difficult, the final concentration can be
achieved by evaporating liquid in the Speed Vac. Vacuum
evaporation, if not to dryness, does not degrade the performance
of the labeled cDNA.

1. Take a 2-3 µl aliquot of the Cy5-labeled cDNA for analysis,
leaving 18-28 µl for hybridization.

2. Run this probe on a 2 % agarose gel (6 cm wide × 8.5 cm
long, 2 mm wide teeth) in Tris Acetate Electrophoresis Buffer
(TAE).

3. Scan the gel on a Molecular Dynamics Storm fluorescence
scanner (setting: red fluorescence, 200 µm resolution, 1,000
V on PMT).

17.2.6. Hybridization

This protocol describes the conditions for hybridizing fluor- 
tagged cDNA representations of the mRNA pools of the samples 
to the EST PCR products immobilized on the glass microarrays. 
17.2.6.1. Materials

• Microarray hybridization chamber

• 50× Denhardt's blocking solution

• pd(A)40-60 (poly d(A)): resuspend at 8 mg/ml and store frozen
at −20 °C (#27-7988, Amersham Pharmacia Biotech.)

• 20 X SSC 

• Yeast tRNA 

• 10 % SDS 

• Coverslips 

• Forceps 

• Coplin jars 

• 0.2 ml thinwall PCR tubes 

• 65 °C water bath 

• Thermocycler for 0.2 ml thinwall PCR tubes 

• Microarray scanner 

• Image analysis software 

Reagents and Solutions 

0.5× SSC/0.01 % SDS Washing Buffer

• Add 25 ml 20× SSC to 974 ml DEPC H2O.

• Sterile filter on a 0.5 µm filter device.

• Add 1 ml 10 % SDS and mix well.

• Store at room temperature.

0.06× SSC Washing Buffer

• Add 3 ml 20× SSC to 997 ml DEPC H2O.

• Sterile filter on a 0.5 µm filter device.

• Store at room temperature. 
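
The volumes in the two washing buffer recipes follow from the usual dilution relation C1 × V1 = C2 × V2. The short sketch below, offered only as a convenience check with a hypothetical function name, computes how much stock is needed for a given final concentration and volume, and reproduces the 25 ml, 3 ml, and 1 ml figures used above for a final volume of 1,000 ml.

```python
def stock_volume(final_conc, final_volume, stock_conc):
    """Volume of stock needed so that C1 * V1 = C2 * V2.

    Concentrations must share one unit (e.g. "x" for SSC or % for SDS);
    the result is in the same unit as final_volume.
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc > stock_conc:
        raise ValueError("cannot dilute to a concentration above the stock")
    return final_conc * final_volume / stock_conc


if __name__ == "__main__":
    print(stock_volume(0.5, 1000, 20))    # ml of 20x SSC for 0.5x SSC
    print(stock_volume(0.06, 1000, 20))   # ml of 20x SSC for 0.06x SSC
    print(stock_volume(0.01, 1000, 10))   # ml of 10 % SDS for 0.01 % SDS
```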

10 mg/ml Human COt-1 DNA

• Add 925 µl 100 % ethanol and 75 µl 3 M sodium acetate
(pH 5.2) to 500 µl Human COt-1 DNA (1 µg/µl).

• Centrifuge at 14,000 × g to pellet.

• Aspirate off supernatant and allow to air dry for 5 min.

• Resuspend the pellet in 50 µl DEPC H2O.
Yeast tRNA

• Resuspend yeast tRNA at 10 mg/ml in DEPC H2O (based on the
supplier's quantitation) in a 1.5 ml polypropylene conical
centrifuge tube.

• Add one-half volume of neutralized phenol and vortex. 

• Add one-half volume of chloroform and vortex 

• Centrifuge 5 min at 10,000 xg. 

• Transfer aqueous layer to a new 1.5 ml polypropylene conical 
centrifuge tube. 

• Add 1 volume of chloroform and vortex. 

• Centrifuge 5 min at 10,000 xg. 

• Repeat chloroform extraction. 

• Transfer aqueous layer to a new 1.5 ml polypropylene conical 
centrifuge tube. 

• Add 0.1 volume of 3 M sodium acetate (pH 5.2). 

• Add 2 volumes of ethanol. 

• Centrifuge 5 min at 10,000 xg. 

• Aspirate off supernatant. 

• Add 1 volume of 70 % ethanol. 

• Centrifuge 5 min at 10,000 xg. 

• Aspirate off supernatant. 

• Allow pellet to dry. 

• Resuspend in DEPC water at the original volume. 

• Determine the RNA concentration by spectrometry. 

• Dilute to 4 mg/ml and store frozen at -20 °C. 
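
The reagent recipes above all follow from the dilution relation C1 x V1 = C2 x V2. The sketch below recomputes the volumes given in the recipes. The function names are illustrative, and the measured tRNA concentration is a hypothetical value chosen only to show the final dilution to 4 mg/ml.

```python
def stock_volume(final_conc, final_volume, stock_conc):
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    return final_conc * final_volume / stock_conc


def diluent_to_add(measured_conc, target_conc, current_volume):
    """Volume of diluent needed to bring a solution down to target_conc."""
    if target_conc > measured_conc:
        raise ValueError("cannot reach a higher concentration by dilution")
    return current_volume * measured_conc / target_conc - current_volume


# 0.5x SSC / 0.01 % SDS washing buffer, 1000 ml final volume
ssc_ml = stock_volume(0.5, 1000, 20)       # from 20x SSC
sds_ml = stock_volume(0.01, 1000, 10)      # from 10 % SDS
water_ml = 1000 - ssc_ml - sds_ml

# 0.06x SSC washing buffer, 1000 ml final volume
ssc_low_ml = stock_volume(0.06, 1000, 20)

# Human COt-1 DNA: 500 µl at 1 µg/µl is pelleted and resuspended in 50 µl
cot1_mg_per_ml = (500 * 1.0) / 50          # µg/µl is numerically mg/ml

# Yeast tRNA: hypothetical measurement of 9.0 mg/ml in 100 µl after cleanup
trna_water_ul = diluent_to_add(9.0, 4.0, 100)

print(ssc_ml, sds_ml, water_ml, ssc_low_ml, cot1_mg_per_ml, trna_water_ul)
```

The same two helpers can be reused for any other stock in this protocol by changing the concentrations and volumes.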

17.2.6.2. Steps for Hybridization and Washing 



1. Determine the volume of hybridization solution required. The 
rule of thumb is to use 0.033 µl for each mm² of slide surface 
area covered by the coverslip used to cover the array. An array 
covered by a 24 mm by 50 mm coverslip will require 40 µl of 
hybridization solution. The volume of the hybridization solution 
is critical. When too little solution is used, it is difficult to 
seat the coverslip without introducing air bubbles over some 
portion of the arrayed ESTs, and the coverslip will not sit at a 
uniform distance from the slide. If the coverslip is bowed 
toward the slide in the center, there will be less labeled cDNA 
in that area and hybridization will be nonuniform. When too 
much volume is applied, the coverslip will move easily during 
handling, leading to misplacement relative to the arrayed ESTs 
and non-hybridization in some areas of the array. 

2. For a 40 µl hybridization, pool the Cy3- and Cy5-labeled 
cDNAs into a single 0.2 ml thinwall PCR tube and adjust 
the volume to 30 µl by either adding DEPC H2O or removing 
water in a SpeedVac. If using a vacuum device to remove 
water, do not use high heat or heat lamps to accelerate 
evaporation, because the fluorescent dyes could be degraded. 

3. For a 40 µl hybridization, combine the following components: 

Component                           High sample blocking   High array blocking 
Cy5 + Cy3 probe                     30 µl                  28 µl 
Poly d(A) (8 mg/ml)                 1 µl                   2 µl 
Yeast tRNA (4 mg/ml)                1 µl                   2 µl 
Human COt-1 DNA (10 mg/ml)          1 µl                   0 µl 
20x SSC                             6 µl                   6 µl 
50x Denhardt’s blocking solution    1 µl                   2 µl 
Total volume                        40 µl                  40 µl 



4. Mix the components well by pipetting, heat at 98 °C for 2 min 
in a PCR cycler, cool quickly to 25 °C, and add 0.6 µl of 10 % 
SDS. 

5. Centrifuge for 5 min at 14,000 x g. 

6. Apply the labeled cDNA to a 24 mm x 50 mm glass coverslip 
and then touch with the inverted microarray. 

7. Place the slide in a microarray hybridization chamber, add 5 µl 
of 3x SSC in the reservoir, if the chamber provides one, or at 
the scribed end of the slide, and seal the chamber. Submerge 
the chamber in a 65 °C water bath and allow the slide to 
hybridize for 16-20 h. 

8. Remove the hybridization chamber from the water bath, cool, 
and carefully dry off. Unseal the chamber and remove the 
slide. 

9. Place the slide, with the coverslip still affixed, into a Coplin jar 
filled with 0.5x SSC/0.01 % SDS wash buffer. Allow the coverslip 
to fall from the slide and then remove the coverslip from the 
jar with forceps. Allow the slide to wash for 2-5 min. 

10. Transfer the slide to a fresh Coplin jar filled with 0.06x SSC. 
Allow the slide to wash for 2-5 min. 

11. Transfer the slide to a slide rack and centrifuge at low rpm 
(700-1,000) for 3 min in a clinical centrifuge equipped with a 
horizontal rotor for microtiter plates. 
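
The volume rule in step 1 and the component table in step 3 can be checked with a few lines of arithmetic. The sketch below is only an illustration. The dictionary keys are labels chosen here, and the volumes are transcribed from the two columns of the step 3 table.

```python
UL_PER_MM2 = 0.033  # rule of thumb from step 1: µl per mm² of coverslip


def hybridization_volume(width_mm, length_mm):
    """Hybridization volume (µl) for a coverslip of the given size."""
    return UL_PER_MM2 * width_mm * length_mm


# Component volumes in µl for a 40 µl hybridization (step 3)
HIGH_SAMPLE_BLOCKING = {
    "Cy5 + Cy3 probe": 30, "Poly d(A)": 1, "Yeast tRNA": 1,
    "Human COt-1 DNA": 1, "20x SSC": 6, "50x Denhardt's": 1,
}
HIGH_ARRAY_BLOCKING = {
    "Cy5 + Cy3 probe": 28, "Poly d(A)": 2, "Yeast tRNA": 2,
    "Human COt-1 DNA": 0, "20x SSC": 6, "50x Denhardt's": 2,
}

volume = hybridization_volume(24, 50)   # 24 mm x 50 mm coverslip
print(f"{volume:.1f} µl, rounded to {round(volume)} µl")

for name, mix in (("high sample blocking", HIGH_SAMPLE_BLOCKING),
                  ("high array blocking", HIGH_ARRAY_BLOCKING)):
    print(name, sum(mix.values()), "µl")
```

For a different coverslip size, the probe volume would need to be scaled so that the components still add up to the computed total.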



17.2.7. Laser Scanning of a Microarray 

This protocol describes how to scan a hybridized slide using a 
Scanarray 3000 scanner. Turn on the scanner and then start the 
computer. Open the Scanarray software. Two image windows will 
appear inside the Scanarray window, one for each channel. 

Insertion of slide for reading: Click the eject slide button. Insert 
the microarray slide, array surface up, into the slide carriage that 
appears. 

Specification of array extent: Select Start from the Acquire 
menu; the Acquire Image window will appear. Set the scan size by 
selecting the Custom radio button. Enter 1.0 mm as the X start 
position and 17.5 mm as the Y start position. In the X box, 
enter 20 mm (the width of the array); in the Y box, enter 40 mm 
(the length of the array). 

Do the following for each channel. Complete both steps for 
channel 2 (the Cy5 channel) and then repeat with channel 1 
(Cy3). Always start scanning with channel 2. To assess whether the 
laser power and PMT settings for the channel will yield overall 
intensities covering the range of intensity values (0-65,535), perform a 
Quick Scan (50 µm resolution). Set laser power to 65 % of 
maximum and PMT voltage to 80 % of maximum as an initial 
approximation. Check the Quick Scan box, select the channel 
being scanned, and click Acquire. Be sure that only a single 
channel box is checked; otherwise both channels will be scanned. 
The image from the channel will appear in one of the image 
windows as the scan is made. 
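
One way to judge a Quick Scan is to look at how much of the 16-bit range (0-65,535) the pixel values occupy and how many pixels are saturated. The sketch below is a generic illustration and is not part of the Scanarray software. The function name and the 0.1 % saturation limit are assumptions made here for the example.

```python
MAX_16BIT = 65535  # upper end of the intensity range quoted above


def quick_scan_summary(pixels, saturation_limit=0.001):
    """Summarize pixel intensities from a low-resolution scan.

    saturation_limit is a hypothetical tolerance: the fraction of
    saturated pixels above which the settings are flagged as too high.
    """
    saturated = sum(1 for p in pixels if p >= MAX_16BIT)
    fraction = saturated / len(pixels)
    return {
        "max": max(pixels),
        "range_used": max(pixels) / MAX_16BIT,
        "saturated_fraction": fraction,
        "too_bright": fraction > saturation_limit,
    }


# Hypothetical pixel values from one channel
pixels = [120, 850, 4300, 22000, 65535, 65535, 310, 95, 15000, 40000]
summary = quick_scan_summary(pixels)
print(summary)
```

If many pixels are flagged as saturated, lowering the laser power or PMT voltage and repeating the Quick Scan would be the natural next step.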


17.3 Spot Quantization 


Spot quantization is the assigning of numerical values to spots 
imaged by the scanner system. The fluorescence signals in each 
spot are encoded in a pixellated image file. The spot quantization 
process is based on standard image processing and recognition 
technology which has been adapted to find circular spots in a 
regular grid pattern. 

The process begins with the output of a laser scanner system, 
which is two image files in tagged image file format (TIFF), Windows 
Bitmap (BMP), or another common image format. Each image 
consists of a grid of pixels, each of which has a 16-bit grayscale value (see the 
Laser Scanning summary). Quantitation begins with the reading 
of these pixellated images into a quantization software package. 
Packages currently in use include Biodiscovery’s Imagene 3.0 [1], 
Michael Eisen’s freeware package Scanalyze [2], and Imaging 
Research’s Array Vision [3]. 

The basic unit of quantization is the microarray spot, typically 
around 100 µm in diameter. Scanner resolution is typically 
10 µm, so there are approximately 75 pixels per spot. A 
well-captured spot should have sharp edges and only a small amount of 
variation in its individual pixel values. Often, a filtering step is 
performed by the quantization software to smooth outlier pixel 
values (a single intense pixel in a background of low-intensity pixels, 
for example) by applying a moving median or average filter. The 
quality of the pixellated spot on the image ultimately depends on 
the physical spotting process, how well the spot DNA was 
cross-linked to the slide substrate, and/or how well mixed the 
hybridization solution was upon application to the spot DNA. In order to 
compare Cy3 and Cy5 signal in a given spot, the pixels in its Cy3 
image must be matched with the corresponding pixels in the Cy5 
image. The software package will have a feature to align each 
spot’s Cy3 and Cy5 images. Features will be provided to translate, 
rotate, shrink, and expand one image relative to the other, to 
obtain accurate superposition. 
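
The pixel count per spot and the moving median filter described above can be illustrated with a short sketch. It is not the algorithm of any of the packages named earlier. The 3 x 3 window and the function names are assumptions chosen for the example.

```python
import math
from statistics import median


def pixels_per_spot(spot_diameter_um, pixel_size_um):
    """Approximate number of pixels covering a circular spot."""
    radius = spot_diameter_um / 2
    return math.pi * radius ** 2 / pixel_size_um ** 2


def median_filter_3x3(image):
    """Replace each pixel by the median of its 3 x 3 neighborhood.

    Windows are truncated at the image border.
    """
    rows, cols = len(image), len(image[0])
    filtered = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            filtered[r][c] = median(window)
    return filtered


print(round(pixels_per_spot(100, 10), 1))   # a 100 µm spot at 10 µm resolution

# A single intense pixel in a low-intensity background
image = [[100, 100, 100],
         [100, 60000, 100],
         [100, 100, 100]]
smoothed = median_filter_3x3(image)
print(smoothed[1][1])
```

A median filter removes the isolated outlier completely, whereas an average filter would only spread it over the neighboring pixels.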


17.4 Data Normalization and Statistical Analysis 


Before it is possible to draw biological conclusions or to apply 
sophisticated statistics, it is important to normalize the data. This 
corrects for systematic biases resulting mainly from different 
amounts of RNA used for labeling, different incorporation 
efficiencies of the Cy3 and Cy5 dyes in the labeling protocols, and 
different detection efficiencies of the dyes [9, 10]. The data 
obtained are normalized using the software on the assumption 
that the log-transformed data approach a normal distribution. 
Usually the global normalization method is used, which assumes that 
the Cy5 and Cy3 intensities are related by a constant factor 
(Cy5 = k x Cy3) and that the log expression ratio of the average or 
median gene of the population is zero (the Cy5/Cy3 ratio for an 
average gene is one). The average gene assumption is based on the 
observation that the majority of the genes do not change their 
expression levels when comparing two mRNA populations. 

Significance analysis of microarrays (SAM) is a statistical 
technique to determine whether changes in gene expression 
are statistically significant. With the advent of DNA microarrays it 
is now possible to measure the expression of thousands of genes in 
a single hybridization experiment. The data generated are 
considerable, and a method for sorting out what is significant and what is 
not is essential. SAM is distributed by Stanford University as an R 
package. SAM identifies statistically significant genes by carrying 
out gene-specific t-tests and computes a statistic dj for each gene j, 
which measures the strength of the relationship between gene 
expression and a response variable [11, 12]. This analysis uses 
nonparametric statistics, since the data may not follow a normal 
distribution. The response variable describes and groups the data 
based on experimental conditions. In this method, repeated 
permutations of the data are used to determine if the expression of 
any gene is significantly related to the response. The use of 
permutation-based analysis accounts for correlations in genes and 
avoids parametric assumptions about the distribution of individual 
genes. This is an advantage over other techniques (for example, 
ANOVA and Bonferroni), which assume equal variance and/or 
independence of genes. SAM is available for download online at 
http://www-stat.stanford.edu/~tibs/SAM/ for academic and 
non-academic users. 


17.5 Applications of Microarray Technology 

17.5.1. Gene Discovery 

DNA microarray technology helps in the identification of new 
genes and in understanding their function and expression levels 
under different conditions. 

17.5.2. Disease Diagnosis 

DNA microarray technology helps researchers learn more about 
different diseases such as heart disease, mental illness, infectious 
disease, and especially cancer. Until recently, different 
types of cancer have been classified on the basis of the organs in 
which the tumors develop. Now, with the evolution of microarray 
technology, it will be possible for researchers to further classify 
the types of cancer on the basis of the patterns of gene activity in 
the tumor cells. This will tremendously help the pharmaceutical 
community to develop more effective drugs, as the treatment 
strategies will be targeted directly to the specific type of cancer. 

17.5.3. Drug Discovery 

Microarray technology has extensive application in 
pharmacogenomics [13, 14]. Pharmacogenomics is the study of correlations 
between therapeutic responses to drugs and the genetic profiles of 
the patients. Comparative analysis of the genes from a diseased 
and a normal cell will help in the identification of the biochemical 
constitution of the proteins synthesized by the diseased genes. 
Researchers can use this information to synthesize drugs 
that act against these proteins and reduce their effect. 

17.5.4. Toxicological Research 

Microarray technology provides a robust platform for research 
on the impact of toxins on cells and their transmission to the 
progeny. Toxicogenomics establishes correlations between 
responses to toxicants and the changes in the genetic profiles of 
the cells exposed to such toxicants. 

17.5.5. Microarray in Microbiology 

Microarrays can be used in microbiology for a multitude of 
applications, from the study of gene regulation and bacterial 
response to environmental changes, genome organization, and 
evolutionary questions up to taxonomic and environmental 
studies. Knowledge of the main aspects of this technology helps to 
understand these specific applications. Vital for further advances 
of microarray technology in microbiology will be the recognition 
of the importance of the physiological experiments ahead of the 
transcription analysis, the standardization of protocols and 
controls for transcription analysis, more integration of the data 
analysis with biochemical and genetic knowledge, and flexible and 
intuitive databases for mining the vast amounts of data [15]. 
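
The global normalization described in Section 17.4 (Cy5 = k x Cy3, with the median log ratio set to zero) can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the routine of any particular analysis package. The intensities are invented, and the median is used as the estimate of the "average gene".

```python
import math
from statistics import median


def global_normalize(cy5, cy3):
    """Center log2(Cy5/Cy3) ratios on their median.

    Returns the estimated constant factor k and the normalized
    log2 ratios, whose median is zero by construction.
    """
    log_ratios = [math.log2(r / g) for r, g in zip(cy5, cy3)]
    shift = median(log_ratios)          # estimate of log2(k)
    normalized = [lr - shift for lr in log_ratios]
    return 2 ** shift, normalized


# Hypothetical spot intensities: three unchanged genes, one induced,
# one repressed, with the Cy5 channel 1.5-fold brighter overall
cy3 = [100, 200, 400, 800, 1600]
cy5 = [150, 300, 600, 4800, 600]

k, normalized = global_normalize(cy5, cy3)
print(round(k, 3), [round(x, 3) for x in normalized])
```

After centering, the unchanged genes have a log ratio of zero, and the induced and repressed genes keep their relative fold changes.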
Microarray Analysis of Different Functional Genes 
of Microorganisms 

Hirak Ranjan Dash and Surajit Das 

Abstract 

Microorganisms play an integral and unique role in ecosystem function and sustainability. Understanding 
the structure and composition of microbial communities and their responses and adaptations to environ- 
mental perturbations such as toxic contaminants, climate changes, and agricultural and industrial practices 
is critical in maintaining or restoring desirable ecosystem functions. The DNA microarray (or microchip) 
technology is a powerful tool for studying gene expression and regulation on a genomic scale and 
detecting genetic polymorphism in both prokaryotes and eukaryotes. Compared to conventional mem- 
brane-based hybridization, glass slide-based microarrays offer the additional advantages of rapid detec- 
tion, lower cost, automation, and low background levels. Microarray-based genomic technique is 
potentially an extremely powerful tool for characterizing microbial communities and their biological 
functions. Hence, in this chapter, we will focus on the detection of expression level of certain functional 
genes in Pseudomonas, Staphylococcus, and viruses. 



18.1 Introduction 



An array is an orderly arrangement of samples where matching of 
known and unknown DNA samples is done based on base-pairing rules. 
An array experiment makes use of common assay systems such as 
microplates or standard blotting membranes. The sample spot 
sizes are typically less than 200 µm in diameter, and an array usually 
contains thousands of spots. Thousands of spotted samples known as 
probes (with known identity) are immobilized on a solid support 
(a microscope glass slide, silicon chip, or nylon membrane). 
The spots can be DNA, cDNA, or oligonucleotides. These are 
used to determine complementary binding of the unknown 
sequences, thus allowing parallel analysis for gene expression and 
gene discovery. An experiment with a single DNA chip can 
provide information on thousands of genes simultaneously. An 
orderly arrangement of the probes on the support is important, 
as the location of each spot on the array is used for the 
identification of a gene. DNA microarrays can also be applied in research for 
gene expression profiling, that is, identification of changes in 
mRNA expression of strains exposed to a particular substrate, for 
example, a specific xenobiotic. In the microarray technique, sequence 
information is required to design probes, so this approach 
is not applicable for discovering new catabolic genes for which 
no sequences are available in the databases. However, knowledge 
of the entire sequence is not necessary for the construction of 
microarrays; PCR products of a random genomic library 
constructed from a microorganism of interest may be used instead. 
Exposure to a toxic substrate is expected to result in differential 
gene expression at the transcript level. This is reflected by differential 
hybridization patterns in the presence or absence of the toxic 
pollutant. Afterward, clones of the library associated with 
differentially hybridized probes can be picked for sequencing. To 
analyze the expression level of various functional genes in different 
microorganisms, the basic protocol is the same for each organism; 
however, the gene chip on which the oligonucleotides are 
immobilized varies and is specific for the organism in question. 
Microarrays can be applied to analyze the xenobiotic-degrading 
genes in environmental bacteria, toxin genes in clinical bacteria, 
and the expression level of various genes in a host due to infection 
by a specific virus. 

Pseudomonas are gram-negative, motile, aerobic, non-spore-forming 
bacteria that show a great deal of metabolic diversity. 
They can act as opportunistic pathogens of human beings, form 
biofilms, and play a role in bioremediation. Pseudomonas species 
have been reported to be used in the bioremediation of polycyclic aromatic 
hydrocarbons, toluene, carbazole, aromatic organic compounds, 
carbon tetrachloride, and almost all heavy metals [1]. As 
Pseudomonas possesses multifaceted properties from the bioremediation 
point of view, it can be used as a potential model organism for the study of 
microbial bioremediation. 

Staphylococcus species are a potential cause of various diseases in 
humans, including skin, eye, and abdominal infections, 
which can lead to organ damage, damage to the eye, or even 
death. Though some strains of Staphylococcus are part of the normal 
microflora of the body, they can change their role by becoming 
opportunistic pathogens. Staphylococcus is of utmost importance 
due to its rapid acquisition of antibiotic resistance, as in 
methicillin-resistant Staphylococcus aureus (MRSA) or 
vancomycin-resistant Staphylococcus aureus (VRSA) [2]. Microarray analysis of 
Staphylococcus can support the diagnosis of these organisms at genus 
and species level and reveal the molecular mechanism of antibiotic 
resistance by detecting various genes. 

Microarray technology has evolved into a potential tool for the 
diagnosis of viral infection and for studying the expression level of various 
genes in the host due to a particular viral infection [3]. Infection 
by a specific virus changes the regulation patterns of genes 
in the host. By targeting the detection of genes that are 
upregulated or downregulated in the host, identifying the cause of a viral infection 
and diagnosing the disease become easier. In some cases, for 
the identification of a particular virus, the gene chips are 
constructed using oligonucleotides from the genome of the virus, 
targeting the genes of that specific virus. Analysis of viruses like 
HIV and influenza virus using microarrays is common nowadays. 



18.2 Materials 

18.2.1. Total RNA Isolation 

1. QIAGEN® RNeasy Mini Purification Kit 

18.2.2. Preparation of Poly-A Controls 

1. Poly-A RNA control kit 

18.2.3. cDNA Synthesis 

1. RNA/primer hybridization mix 
2. 5x 1st strand buffer 
3. 100 mM DTT 
4. 10 mM dNTPs 
5. SUPERase In (20 U/µl) 
6. SuperScript II (200 U/µl) 

18.2.4. Removal of RNA 

1. 1 N NaOH 
2. 1 N HCl 

18.2.5. Purification and Quantification of cDNA 

1. MinElute PCR purification columns 
2. Spectrophotometer 

18.2.6. cDNA Fragmentation 

1. 10x DNase I Buffer 
2. cDNA 
3. DNase I 
4. Nuclease-free water 

18.2.7. Terminal Labeling 

1. 5x reaction buffer 
2. GeneChip DNA labeling reagent, 7.5 mM 
3. Terminal deoxynucleotidyl transferase 
4. Fragmented cDNA product 
5. Water 

18.2.8. Target Hybridization, Washing, Staining, and Scanning 



1. GeneChip hybridization, wash, and stain kit 

2. Hybridization module, Box 1 

3. 2x Hybridization mix 

4. DMSO 

5. Nuclease-free water 

6. Control Oligo B2, 3 nM 

7. Hybridization Oven 640 

8. Sterile, RNase-free microcentrifuge vials, 1.5 ml 

9. Micropipette 

10. Sterile barrier pipette tips and non-barrier tips 

11. Stain Cocktail 1 

12. Stain Cocktail 2 

13. Array holding buffer 

14. Wash buffer A and B 

15. RNase-free water 

16. Fluidics Station 450 

17. GeneChip® Scanner 3000 

18. Tough-Spots™, Label Dots 

19. Experiment and Fluidics Station Setup 



18.3 Method 



18.3.1. Total RNA Isolation 

1. Grow the bacterial culture in liquid basal medium and harvest the 
overnight culture by centrifugation. 

2. Isolate total RNA using a standard procedure for 
bacterial RNA isolation, such as the QIAGEN® RNeasy Mini Purification 
Kit. 

18.3.2. Preparation of Poly-A RNA Controls 

1. The Poly-A RNA Control Stock and Poly-A Control Dilution 
buffer are generally provided with the Poly-A RNA 
Control Kit. Use them to prepare the appropriate serial dilutions 
based on the recommendations in Table 18.1. 



18.3.3. cDNA Synthesis 



1. Prepare the following mixture for primer annealing. 

2. Incubate the RNA/primer mix at the following temperatures: 

• 70 °C for 10 min 
• 25 °C for 10 min 
• Chill to 4 °C 

Table 18.1 
Serial dilutions of poly-A RNA control stock 

Array format            Serial dilution        Spike-in volume (µl) 
                        First      Second 
169 Format (Mini)       1:20       1:16        2 
100 Format (Midi)       1:20       1:20        2 
49 Format (Standard)    1:20       1:13        2 
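
The overall dilution of the poly-A control stock for each array format is the product of the two serial dilutions in Table 18.1. The sketch below works this out with exact fractions. The dictionary and function names are illustrative only.

```python
from fractions import Fraction

# (first dilution, second dilution) per array format, from Table 18.1
POLY_A_DILUTIONS = {
    "169 Format (Mini)": (20, 16),
    "100 Format (Midi)": (20, 20),
    "49 Format (Standard)": (20, 13),
}


def cumulative_dilution(first, second):
    """Overall dilution after a 1:first step followed by a 1:second step."""
    return Fraction(1, first) * Fraction(1, second)


overall = {fmt: cumulative_dilution(*steps)
           for fmt, steps in POLY_A_DILUTIONS.items()}

for fmt, dilution in overall.items():
    print(f"{fmt}: 1:{dilution.denominator}")
```

In each case 2 µl of the final dilution is used as the spike-in volume, as listed in the table.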



Table 18.2 
Ingredients for cDNA synthesis 

Ingredients                                         Volume (µl)   Final concentration 
RNA/primer hybridization mix (from previous step)   30            - 
5x 1st strand buffer                                12            1x 
100 mM DTT                                          6             10 mM 
10 mM dNTPs                                         3             0.5 mM 
SUPERase In (20 U/µl)                               1.5           0.5 U/µl 
SuperScript II (200 U/µl)                           7.5           25 U/µl 
Total volume                                        60 
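
The final concentrations in Table 18.2 follow from the stock concentration, the volume added, and the 60 µl total. The sketch below recomputes them as a consistency check. The data layout is an illustration chosen here, and the values are transcribed from the table.

```python
TOTAL_UL = 60.0

# name: (volume in µl, stock concentration, unit)
COMPONENTS = {
    "1st strand buffer": (12.0, 5.0, "x"),
    "DTT": (6.0, 100.0, "mM"),
    "dNTPs": (3.0, 10.0, "mM"),
    "SUPERase In": (1.5, 20.0, "U/µl"),
    "SuperScript II": (7.5, 200.0, "U/µl"),
}
RNA_PRIMER_MIX_UL = 30.0


def final_concentration(volume_ul, stock_conc, total_ul=TOTAL_UL):
    """Concentration after adding volume_ul of stock to a total_ul reaction."""
    return stock_conc * volume_ul / total_ul


final = {name: final_concentration(vol, stock)
         for name, (vol, stock, _unit) in COMPONENTS.items()}
total = RNA_PRIMER_MIX_UL + sum(vol for vol, _s, _u in COMPONENTS.values())

for name, value in final.items():
    print(f"{name}: {value:g} {COMPONENTS[name][2]}")
print("total volume:", total, "µl")
```

If the reaction is scaled up or down, changing TOTAL_UL and the volumes in proportion leaves the final concentrations unchanged.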





3. Prepare the reaction mix for cDNA synthesis. Briefly centrifuge 
the reaction tube to collect the sample at the bottom and 
add the cDNA synthesis mix from Table 18.2 to the RNA/primer 
hybridization mix. 

4. Incubate the reaction at the following temperatures: 

• 25 °C for 10 min 

• 37 °C for 60 min 

• 42 °C for 60 min 

• Inactivate SuperScript II at 70 °C for 10 min 

• Chill to 4 °C 

18.3.4. Removal of RNA 

1. Add 20 µl of 1 N NaOH and incubate at 65 °C for 30 min. 

2. Add 20 µl of 1 N HCl to neutralize. 

Table 18.3 
Fragmentation reaction 

Ingredients            Volume         Final concentration 
10x DNase I Buffer     2 µl           1x 
cDNA                   10 µl          - 
DNase I                X µl           0.6 U/µg of cDNA 
Nuclease-free water    Up to 20 µl    - 
Total volume           20 µl 
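
The DNase I volume "X" in Table 18.3 depends on the amount of cDNA and on the enzyme stock concentration, which the table does not specify. The sketch below shows the calculation. The 3 µg of cDNA and the 1 U/µl working stock are hypothetical values used only for the example.

```python
def fragmentation_mix(cdna_ug, dnase_stock_u_per_ul,
                      units_per_ug=0.6, total_ul=20.0,
                      buffer_ul=2.0, cdna_ul=10.0):
    """Volumes (µl) for the fragmentation reaction of Table 18.3.

    dnase_stock_u_per_ul is an assumption supplied by the user; the
    protocol only fixes the dose of 0.6 U DNase I per µg of cDNA.
    """
    dnase_ul = units_per_ug * cdna_ug / dnase_stock_u_per_ul
    water_ul = total_ul - buffer_ul - cdna_ul - dnase_ul
    if water_ul < 0:
        raise ValueError("DNase I volume too large; use a more concentrated stock")
    return {"10x DNase I Buffer": buffer_ul, "cDNA": cdna_ul,
            "DNase I": dnase_ul, "Nuclease-free water": water_ul}


# Hypothetical example: 3 µg cDNA, DNase I diluted to 1 U/µl
mix = fragmentation_mix(cdna_ug=3.0, dnase_stock_u_per_ul=1.0)
for name, volume in mix.items():
    print(f"{name}: {volume:.1f} µl")
```

A stock that is too dilute leaves no room for water in the 20 µl reaction, which is why the function raises an error in that case.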





18.3.5. Purification and Quantification of cDNA 

1. Use MinElute PCR Purification Columns to clean up the 
cDNA synthesis product. Elute the product with 12 µl of 
EB. The average volume of the eluate is 11 µl from 12 µl of 
elution buffer. 

2. Quantify the purified cDNA product by 260 nm absorbance. 

3. Typical yields of cDNA are 3-7 µg. A minimum of 1.5 µg of 
cDNA is required for subsequent procedures to obtain 
sufficient material to hybridize onto the array and to perform 
necessary quality control experiments. 

18.3.6. cDNA Fragmentation 

1. Prepare the following reaction mix (Table 18.3). 

2. Incubate the reaction at 37 °C for 10 min. 

3. Inactivate DNase I at 98 °C for 10 min. 

4. The fragmented cDNA is applied directly to the terminal 
labeling reaction. 

5. Alternatively, the material can be stored at -20 °C for later 
use. 

18.3.7. Terminal Labeling 

1. Prepare the following reaction mix (Table 18.4). 

2. Incubate the reaction at 37 °C for 60 min. 

3. Stop the reaction by adding 2 µl of 0.5 M EDTA. 

4. The target is ready to be hybridized onto probe arrays. 
Alternatively, it may be stored at -20 °C for later use. 

Table 18.4 
Ingredients for terminal label reaction 

Ingredients                               Volume 
5x Reaction buffer                        10 µl 
GeneChip DNA labeling reagent, 7.5 mM     2 µl 
Terminal deoxynucleotidyl transferase     2 µl 
Fragmented cDNA product                   Up to 20 µl 
Water                                     16 µl 
Total volume                              50 µl 

18.3.8. Target Hybridization, Washing, Staining, and Scanning 

1. Prepare the following hybridization cocktail. 

2. Equilibrate the probe array to room temperature immediately 
before use. 


3. Based on the format of the array type used, add the 
appropriate volume of hybridization cocktail. 

Array                   Volume 
49 Format (Standard)    200 µl 
100 Format (Midi)       130 µl 
169 Format (Mini)       80 µl 

4. Place the probe array in the hybridization oven set at 50 °C. 

5. To avoid stress to the motor, load probe arrays in a balanced 
configuration around the axis. Rotate at 60 rpm. 

6. Hybridize for 16 h. Prepare the reagents for the washing and 
staining steps, which are required immediately after completion of 
hybridization. 

Step 1: Defining File Locations 

1. Launch Microarray Suite from the workstation and select 
Tools → Defaults → File Locations from the menu bar. 

2. The File Locations window displays the locations of the 
following files: 

• Probe Information (library files, mask files) 

• Fluidics Protocols (fluidics station scripts) 

• Experiment Data (.exp, .dat, .cel, and .chp files) 

3. Verify that all three file locations are set correctly and click 
OK. 

Step 2: Entering Experiment Information 

To wash, stain, and scan a probe array, an experiment must first 
be registered in GCOS or Microarray Suite. The fields of 
information required for registering experiments in Microarray Suite are: 

• Experiment Name 

• Probe Array Type 

• Sample Name 

• Sample Type 

• Project 

Step 3: Preparing the Fluidics Station 

The Fluidics Station 400 or 450/250 is used to wash and stain 
the probe arrays. It is operated using GCOS/Microarray Suite. 

1. Turn on the fluidics station using the switch on the lower 
left side of the machine. 

2. Select Run → Fluidics from the menu bar. 

3. Prime the fluidics station in the following situations: 

• When the fluidics station is first started 

• When wash solutions are changed 

• Before washing if a shutdown has been performed 

• If the LCD window instructs the user to prime 

Preparing the Stain Reagents 

1. Remove Stain Cocktail 1, Stain Cocktail 2, and Array Holding 
Buffer from the stain module. 

2. Gently tap the bottles to mix. 

3. Aliquot the following reagents: 

(a) 600 µl of Stain Cocktail 1 into a 1.5 ml amber 
microcentrifuge vial. 

(b) 600 µl of Stain Cocktail 2 into a 1.5 ml clear 
microcentrifuge vial. 

(c) 800 µl of Array Holding Buffer into a 1.5 ml clear 
microcentrifuge vial. 

4. Spin down all vials to remove any air bubbles. 

Washing and Staining the Probe Array 

1. In the Fluidics Station dialog box on the workstation, select 
the correct experiment name from the drop-down experiment 
list. The probe array type appears automatically. 







2. In the protocol drop-down list, select the appropriate anti- 
body amplification protocol to control the washing and stain- 
ing of the probe array format being used. 

3. Choose Run in the Fluidics Station dialog box to begin the 
washing and staining. Follow the instructions in the LCD 
window. 

4. Insert the appropriate probe array into the designated module 
of the fluidics station while the cartridge lever is down or in 
the eject position. When finished, verify that the cartridge 
lever is returned to the up or engaged position. 

5. Remove any microcentrifuge vials remaining in the sample 
holder of the fluidics station module being used. 

6. Follow the instructions on the LCD window of the fluidics 
station by placing the three experiment sample vials into the 
sample holders 1,2, and 3 on the fluidics station. 

7. When the protocol is complete, the LCD window displays the 
message EJECT & INSPECT CARTRIDGE. 

8. Remove the probe arrays from the fluidics station modules by 
first pressing down the cartridge lever to the eject position. 

9. Check the probe array window for large bubbles or air pockets. 
If the probe array has no large air bubbles, it is ready to scan on 
the GeneArray® Scanner or the GeneChip® Scanner 3000. 

10. Keep the probe arrays at 4 °C and in the dark until ready for 
scanning. 

11. If there are no more samples to hybridize, shut down the 
fluidics station. 

Scanning the Probe Array 

1. Select Run → Scanner from the menu bar. Alternatively, click 
the Start Scan icon in the tool bar. The Scanner dialog box 
appears with a drop-down list of experiments that have not 
been run. 

2. Select the experiment name that corresponds to the probe 
array to be scanned. A previously run experiment can also be 
selected by using the Include Scanned Experiments option 
box. After selecting this option, previously scanned 
experiments appear in the drop-down list. 

3. Once the experiment has been selected, click the Start 
button. A dialog box prompts you to load the sample into the 
scanner. 

4. Open the sample door on the scanner and insert the probe 
array into the holder. Do not force the probe array into the 
holder. Close the sample door of the scanner. If you are using 
the GeneChip Scanner 3000, do not attempt to close the 
door by hand. 

5. Click OK in the Start Scanner dialog box. The scanner 
begins scanning the probe array and acquiring data. When 
Scan in Progress is selected from the View menu, the probe 
array image appears on the screen as the scan progresses. 

DNA Cloning and Sequencing 

Bandamaravuri Kishore Babu, Anu Sharma, and Hari Kishan Sudini 



Abstract 

The plasmid DNA is cleaved with an enzyme and joined in vitro to foreign DNA; the resulting recombi- 
nant plasmids are then used to transform bacteria. The plasmid vectors must be carefully chosen and 
processed to minimize the effort required to identify and characterize recombinants. This chapter 
provides guidelines for preparation of DNA fragment for cloning, transformation into chemically compe- 
tent host, and selection of positive clones. The write-up will also describe basic methods used in the 
cloning of PCR amplified rRNA gene into appropriate vector and followed by sequencing. 



19.1 Introduction 



In principle, cloning in plasmid vectors is very straightforward. The 
easiest fragment to clone carries noncomplementary protruding 
termini generated by digestion with two different restriction 
enzymes. Since most present-day vectors contain a polylinker 
that has multiple cloning sites, it is almost always possible to find 
restriction sites that are compatible with the termini of the foreign 
DNA fragment. The fragment of foreign DNA is then inserted into 
the vector by a process known as directional cloning. 

Fragments of foreign DNA carrying identical termini (either 
blunt-ended or protruding) must be cloned in a linearized plasmid 
vector bearing compatible ends. During the ligation reaction, both the 
foreign DNA and the plasmid DNA have the capacity to 
circularize and to form tandem oligomers. It is therefore necessary 
to carefully adjust the concentrations of the two types of DNA in 
the ligation reaction to optimize the number of correct ligation 
products. In addition, removal of the 5'-phosphate groups with 
alkaline phosphatase will help to suppress self-ligation and 
circularization of the plasmid DNA. 
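
One common way to adjust the relative concentrations of vector and insert is to fix an insert:vector molar ratio and convert it to mass using the fragment lengths. The chapter does not prescribe a ratio, so the 3:1 value, the fragment sizes, and the function name below are assumptions used only to illustrate the arithmetic.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=3.0):
    """Mass of insert (ng) giving the chosen insert:vector molar ratio.

    Molar amounts are proportional to mass divided by length, so the
    insert mass scales with insert_bp / vector_bp.
    """
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


# Hypothetical example: 50 ng of a 3,000 bp vector and a 1,500 bp insert
needed = insert_mass_ng(vector_ng=50, vector_bp=3000, insert_bp=1500)
print(f"{needed:.0f} ng of insert for a 3:1 insert:vector ratio")
```

Trying a few ratios around the chosen value in parallel ligations is a practical way to find the condition that gives the most recombinant clones.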
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19.2 Materials 



1. Glassware, instruments, and other materials: Electrophoresis 
tank (horizontal/submarine), UV transilluminator, sterile 
microflige tubes, micropipettes, sterile micropipette tips, 
shaker incubator, sterile 50 ml polypropylene centrifuge, ice- 
box, microflige, vortex mixture, water bath, etc. 

2. Chemicals: 

(a) Agarose 

(b) 50 X TAE buffer 

(c) Ethidium bromide (10 mg/ml) 

(d) Restriction endonucleases (EcoRI, etc.) 

(e) DNA digested with Hinilll 

(f) Isolated plasmids (with and without inserts of foreign 
DNA) 

(g) Calf Alkaline Phosphatase (CIP) 

(h) T4 DNA ligase and 10 x ligase buffer 

(i) Buffer saturated phenol 

(j) Chloroform 

(k) 3 M Sodium acetate (pH 7) 

(l) 3 M Sodium acetate (pH 5.2) 

(m) 10 M Ammonium acetate 

(n) LB broth 

(o) LB agar plates with and without ampicillin 

(p) E. coli DH5a Competent cells 

(q) IPTG (200 mg/ml) 

(r) X-gal (20 mg/ml in dimethylformamide) 

(s) pTZ57R/T vector (MBI Fermentas) or pCR™II vector 
(Invitrogen) 

(t) Calcium chloride, 0.2 M (prepare a 1 M stock solution of 
CaCl2 and store 10 ml aliquots frozen at -20 °C). Just 
before use, dilute an aliquot to 100 ml with sterile water, 
sterilize by filtration through a 0.45 µm filter, and then chill 
on ice. 

(u) BDT v3.1 Reaction Mix (Applied Biosystems 
#4337455) [1] 

(v) 5x Sequencing buffer (Applied Biosystems #4336697) 

(w) Hi-Di formamide (Applied Biosystems) 

(x) 0.5 M EDTA, pH 8.0 

(y) 125 mM EDTA 

(z) TE buffer 

19.3 Methods 

19.3.1. Ligation of Vector and Foreign DNA Fragment 

Ligation of a segment of foreign DNA to a linearized plasmid 
vector involves the formation of new bonds between phosphate 
residues located at the 5' termini of the DNA and the adjacent 
3'-hydroxyl moieties. Ligation of one end of DNA to another can be 
regarded as a bimolecular reaction whose velocity under standard 
conditions is determined solely by the concentration of compatible 
DNA ends, whether they lie on the same DNA molecule 
(intramolecular ligation) or on different molecules (intermolecular 
ligation). A low concentration of DNA in the ligation reaction 
favors intramolecular ligation, whereas a high DNA concentration 
may result in the formation of dimers and/or larger oligomers of 
the plasmid. 

19.3.2. Preparation of Vector DNA and Foreign DNA Fragment for Cloning 

1. Digest the plasmid DNA (prepared by miniprep) and the foreign 
DNA with the desired restriction endonucleases. 

2. Check 5 µl of the digested plasmid for completion of digestion 
on a 0.8 % agarose gel. Use undigested plasmid and marker 
DNA for comparison. 

3. Mix the following components, in the order listed, in a 
microfuge tube on ice: 

To linearize the plasmid: 

Plasmid DNA (1-2 µg)                      5 µl 
10x RE buffer                             2 µl 
Restriction enzyme 1 (e.g., EcoRI)        2 µl 
Restriction enzyme 2 (e.g., HindIII)      2 µl 
Deionized distilled water                 to 20 µl 

4. Mix the contents by gentle tapping, pulse spin in a microfuge 
to bring all the liquid to the bottom of the tube, and incubate 
at 37 °C for 1 h. 

5. Heat at 65 °C for 10 min to stop the reaction. Chill on ice. 
Add 5 µl of loading dye, mix, and pulse spin. 

6. Load the digested products on a 1 % TAE/agarose gel, with λ 
DNA digested with HindIII as marker. 

19.3.3. Preparation of Phosphatase-Treated Vector 

1. The restriction-digested plasmid is now subjected to 
phosphatase (CIP) treatment as follows: 

2. Incubate at 37 °C for 30 min. 

3. Add another 1 µl of CIP and continue the incubation for 
30 min. 

4. After 1 h of CIP treatment, add 1 µl of 0.5 M EDTA (pH 8.0) 
to get a final concentration of 5 mM. 

5. Incubate at 75 °C for 10 min to inactivate the CIP. 

6. Cool the reaction to room temperature and extract once with 
an equal volume of phenol and once with an equal volume of 
phenol-chloroform mixture. 

7. To the aqueous phase, add 0.1 volume of 3 M sodium acetate 
(pH 7.0) and 2 volumes of ethanol, mix well, and precipitate 
the linear dephosphorylated vector at -20 °C for 30 min. 

8. Recover the DNA by centrifugation at 4 °C in a microfuge. 

9. Wash the pellet with 70 % ethanol, air dry, and dissolve in 10 µl 
of distilled water. 

10. Run an aliquot of both the vector and the insert DNA on a gel 
to estimate the DNA concentration before setting up the 
ligation. 

19.3.4. Ligation 

1. Set up the ligation reaction in a total volume of 10 µl as follows 
(for a foreign DNA fragment equal in length to the vector 
DNA) (Table 19.1). 

2. If the foreign DNA is smaller than the vector, reduce the amount 
of foreign DNA accordingly so that its molar concentration 
equals that of the vector. 

3. Include the necessary controls, such as ligation without the 
foreign DNA fragment (vector recircularization control) and 
ligation of vector alone that was not treated with phosphatase 
(ligation control). 

4. In each case, adjust the volume to 10 µl with H2O. 

5. Incubate the reaction for 4-16 h at 16 °C. 

6. Use 2 µl of the ligation mixture for transformation of bacteria. 
Store the remaining ligation reaction at -20 °C for further use. 

Table 19.1 
Ligation reaction mixture of 10 µl volume 

Ingredients                                 Volume      Final concentration 
Vector DNA (100 ng/µl) with                 2 µl        200 ng 
  dephosphorylated 5' termini 
Foreign DNA fragment (100 ng/µl) with       2 µl        200 ng (equimolar to vector) 
  compatible phosphorylated termini 
10x ligase buffer with 10 mM ATP            1 µl        1x 
T4 DNA ligase                               1 µl        0.1 Weiss unit 
Deionized distilled water                   to 10 µl 

19.3.5. Transformation 

When bacteria are treated with an ice-cold solution of CaCl2 and 
then briefly heated, they can be transformed with plasmid DNA. 
Apparently the treatment induces a transient state of "competence" 
in the recipient bacteria, during which they are able to 
take up DNAs derived from a variety of sources. Most commonly 
used methods yield transformants at a frequency of 10^-10^ 
transformants/µg of supercoiled plasmid. Competent cells of 
E. coli strains such as JM107, XL1-Blue, DH5α, SURE, 
and NM522 can be used for transformation. 

19.3.6. Competent Cell Preparation 

Day 1: Selection of E. coli DH5α on LB Agar Plates 

1. Streak E. coli DH5α, either from a glycerol stock or from 
any viable culture stored at 4 °C, with a platinum loop on LB 
agar plates. 

2. Incubate the LB plate at 37 °C overnight (O/N) and isolate a 
single colony. 

Day 2: Preparation for O/N Culture 

1. Inoculate 5 ml of LB broth with a single colony from the LB 
agar plate. 

2. Grow O/N in a shaking incubator (150 rpm) at 37 °C. 

3. Keep the following in the cold room for use the next day: 50 ml 
centrifuge tubes, microfuge tubes, 10 and 5 ml glass pipettes, 
CaCl2 solution. 

Day 3: Preparation of Competent Cells 

1. Inoculate 100 ml of LB medium in a 1 liter flask with the O/N 
culture (1 % inoculum). 

2. Grow the culture at 37 °C with shaking for 2-3 h. Measure the 
absorbance at 550 nm (A550) every 30 min until it is between 
0.4 and 0.5 (i.e., when the cells are in early log phase). 

3. Chill the flask on ice for 5-10 min. Transfer the culture to 
pre-chilled 50 ml centrifuge tubes. 

4. Centrifuge at 5,000 rpm for 5 min at 4 °C. 

5. Decant the supernatant and suspend the pellet gently, with the 
help of a pre-chilled glass pipette, in 20 ml of 200 mM CaCl2 
and incubate in ice water for 20 min. Centrifuge at 
5,000 rpm for 5 min at 4 °C. 

6. Decant the supernatant and suspend the pellet gently in 4 ml of 
80 mM CaCl2 solution. 

7. Aliquot the cells into microfuge tubes in 200 µl portions or 
multiples thereof, immediately freeze in liquid nitrogen, and 
store the cells at -70 °C. 

8. If they are not to be frozen, the competent cells may be kept 
on ice O/N before use. 

19.3.7. Transformation 

1. Thaw a 200 µl aliquot of competent cells on ice. 

2. Add 2 µl of ligation mix containing approximately 50-100 ng 
of DNA to the competent cells. 

3. The volume of DNA/ligation mix added to a 200 µl 
aliquot of competent cells should not exceed 10 µl. 

4. Mix the contents of the tube by swirling gently. Incubate on ice 
for 30 min. 

5. Heat shock at 42 °C for exactly 120 s. (Do not shake the 
tubes.) 

6. Rapidly transfer the tubes to an ice bath and allow the cells to 
chill for 1-2 min. 

7. Add four volumes (0.8 ml) of LB medium and keep in a water 
bath at 37 °C for 1 h to allow the bacteria to recover and 
express the antibiotic resistance marker encoded by the 
plasmid. 

8. Centrifuge at 5,000 rpm for 5 min to pellet the cells. 
Remove most of the medium, resuspend the cells in the remaining 
medium, and plate on LB ampicillin plates with IPTG 
and X-gal. 

9. Include the following controls: 

(a) No DNA control 

(b) Transformation efficiency control 

10. Incubate the plates at 37 °C O/N. 




19.3.8. Selection of Positive Clones 

Four methods are commonly used to identify bacterial colonies 
that contain recombinant plasmids: 

19.3.8.1. Restriction Analysis 

In restriction analysis, a number of independently transformed 
bacterial colonies are picked and grown in small cultures. Plasmid 
DNA isolated from each culture is analyzed by digestion with 
restriction enzymes and gel electrophoresis. 

19.3.8.2. Insertional Inactivation 

This method can only be used with older vectors (e.g., pBR322) 
that carry two or more antibiotic resistance genes and an 
appropriate distribution of restriction enzyme cleavage sites. The 
foreign DNA is cloned in the plasmid in such a way that it disrupts 
the reading frame of one of the antibiotic resistance genes. 
Recombinant bacteria are screened by growing identical colonies 
separately on more than one antibiotic plate. If the bacteria 
become sensitive to the antibiotic whose gene was disrupted by the 
insertion of the foreign DNA, and remain resistant to the others, 
this indicates that the bacteria contain the plasmid carrying 
foreign DNA. 

19.3.8.3. Screening by Colony Hybridization 

In this method the bacterial colonies are transferred onto 
nitrocellulose paper and lysed to release and denature the DNA, 
which is immobilized on the nitrocellulose paper. The paper 
is then hybridized with radioactively labeled DNA fragments that 
were used for cloning. The autoradiogram is aligned with the 
original bacterial plate, from which the colonies were transferred, 
to identify the bacteria carrying foreign DNA. 

19.3.8.4. α-Complementation 

Many of the vectors in current use (e.g., the pUC series) carry a 
short segment of E. coli DNA that contains the regulatory sequences 
and the coding information for the first 146 amino acids of the 
β-galactosidase gene (lacZ). Embedded in this coding region is a 
polycloning or multiple cloning site that does not disrupt the 
reading frame but results in the harmless interpolation of a small 
number of amino acids into the amino-terminal fragment of 
β-galactosidase. Vectors of this type are used in host cells that 
code for the carboxy-terminal portion of β-galactosidase. Although 
neither the host-encoded nor the plasmid-encoded fragment is itself 
active, they can associate to form an enzymatically active protein. 
This type of complementation, in which deletion mutants of the 
operator-proximal segment of the lacZ gene are complemented by 
β-galactosidase-negative mutants that have the operator-proximal 
region intact, is called α-complementation. The Lac+ bacteria that 
result from α-complementation are easily recognized because they 
form blue colonies in the presence of the chromogenic substrate 
5-bromo-4-chloro-3-indolyl-β-D-galactoside (X-gal). However, 
insertion of a fragment of foreign DNA into the polycloning site of 
the plasmid almost invariably results in the production of an 
amino-terminal fragment that is not capable of α-complementation. 
Bacteria carrying recombinant plasmids therefore form white 
colonies. The development of this simple color test has greatly 
simplified the identification of recombinants constructed in plasmid 
vectors of this type. It is easily possible to screen many thousands of 
colonies visually and to recognize colonies that carry putative 
recombinant plasmids. The structure of these plasmids is then 
verified by restriction analysis of minipreparations of plasmid DNA. 

19.3.9. Experimental Procedure 

1. To a premade LB agar plate containing ampicillin, add 40 µl of 
stock solution of X-gal (20 mg/ml) and 4 µl of solution of 
IPTG (200 mg/ml). 

2. Using a sterile glass spreader, spread the solution over the 
entire surface of the plate. 

3. Incubate the plate at 37 °C until the fluid has disappeared. 
(Note: It may take up to 2-3 h if the plate is freshly made.) 

4. Inoculate the plate with 200 µl of transformation mixture and 
spread it over the entire surface of the plate using a sterile 
glass spreader. 

5. Incubate the inoculated plate in an inverted position for 
12-16 h at 37 °C. 

6. Store the plate at 4 °C for several hours. This allows the blue 
color to develop fully. 

7. Pick a few blue and a few white colonies for analysis by 
restriction endonuclease digestion. 

Note: Colonies that contain active β-galactosidase are pale 
blue in the center and dense blue at the periphery. White 
colonies occasionally show a faint blue spot in the center, 
but these are colorless at the periphery. 

19.3.10. Cloning of PCR Product Using T/A Overhang 



Cloning of a PCR product into an appropriate vector followed by 
sequencing allows identification and characterization of the product. 
The basic methods used for cloning PCR products include: 

TA cloning: Since a PCR product generated by Taq polymerase 
carries a single extraneous dA at its 3' ends, the easiest way 
of cloning it is to use a plasmid tailed with dT. 

Blunt end cloning: The blunt-ended PCR product generated by 
Pwo or Pfu polymerase can be cloned into a plasmid cut 
with enzymes that generate blunt ends. 

Directional cohesive end cloning: In this case the PCR product is 
first digested with appropriate restriction enzymes and then 
ligated into a plasmid linearized with the same restriction 
enzymes. 

19.3.11. Ligation 

The following protocol describes cloning of a PCR-amplified 
product using a TA vector: 

1. PCR products are first purified to remove enzymes, unused 
primers, dNTPs, etc. For this, the PCR product is run on a 
low-melting agarose gel, followed by extraction and purification 
(many commercial gel extraction kits are available). 

Note: The efficiency of ligation is known to depend on 
the purity of the PCR fragments. If a single homogeneous 
band of the desired size is observed on the gel after purification 
of the PCR product, it can be used directly in the ligation reaction. 

2. Dissolve the purified PCR fragment in 10-20 µl of TE buffer and 
determine the DNA concentration using a NanoDrop UV 
spectrophotometer, or alternatively load 2 µl of DNA on an agarose 
gel and compare with known amounts of DNA markers. 

3. Calculate the amount of PCR fragment required for ligation 
(in equimolar concentration) using the formula: 

X ng of PCR product to ligate = (Y bp of PCR product) × 
(50 ng of vector) / (size in bp of the vector) 

Three times "X" ng would be used for a 1:3 molar ratio. 

4. Ligation reaction mixture: The components of the ligation 
reaction mixture are listed in Table 19.2. 

5. Incubate at 22 °C for 1 h. 

6. For maximum yield, the reaction time can be extended overnight. 

Table 19.2 
Components of ligation reaction mixture 

Ingredients                                    Volume (µl) 
Plasmid vector pTZ57R/T (0.165 µg)             1 
PCR fragment (approx. 0.495 µg)                2 
10x ligation buffer                            1 
PEG 4000 solution                              1 
Deionized water                                5 
T4 DNA ligase, 5 U                             1 

19.3.12. Transformation and Selection of the Clones 

1. Follow the procedure described in the sections above. 

2. Plasmid isolation may be done following the protocol given in 
this manual. 
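
The insert-to-vector calculation given in step 3 of the ligation protocol (Sect. 19.3.11) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name, the default of 50 ng vector, and the fragment sizes in the usage lines are hypothetical values chosen for the example, not figures from the protocol.

```python
def insert_mass_ng(insert_bp, vector_bp, vector_ng=50.0, molar_ratio=1.0):
    """Mass of insert (ng) needed for a given insert:vector molar ratio.

    Implements: X ng insert = (insert bp x vector ng) / vector bp,
    multiplied by the desired molar excess of insert over vector.
    """
    if insert_bp <= 0 or vector_bp <= 0:
        raise ValueError("fragment sizes must be positive")
    return molar_ratio * insert_bp * vector_ng / vector_bp


# Hypothetical sizes, for illustration only: 600 bp insert, 3,000 bp vector.
equimolar = insert_mass_ng(insert_bp=600, vector_bp=3000)
one_to_three = insert_mass_ng(insert_bp=600, vector_bp=3000, molar_ratio=3)
print(f"1:1 -> {equimolar:.1f} ng, 1:3 -> {one_to_three:.1f} ng")
```

With these assumed sizes the equimolar amount is 10 ng of insert for 50 ng of vector, and three times that for a 1:3 vector:insert molar ratio.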
19.3.13. Release of the Insert 

To release the insert: 

Plasmid DNA (1-2 µg)                      5 µl 
10x RE buffer                             2 µl 
Restriction enzyme 1 (e.g., EcoRI)        2 µl 
Restriction enzyme 2 (e.g., HindIII)      2 µl 
Deionized distilled water                 to 20 µl 

19.3.14. Sequencing of PCR Product 

Sequencing may be done following the protocol of the "Applied 
Biosystems BigDye Terminator v3.1 Cycle Sequencing Kit" [1]. 

19.3.15. Removal of Excess of Salts and Enzymes from PCR Products 

Excess salts can be removed from the PCR-amplified products as 
follows: 

1. To the template DNA, add enough MQ water to bring the 
volume to 100 µl. Add 10 µl of 3 M sodium acetate pH 5.5 
and 250 µl of chilled absolute ethanol. 

2. Mix the contents well and incubate on ice for 20-30 min. 

Note: Incubation at lower temperatures and for longer periods 
may cause precipitation of salts and hence is not recommended. 

3. Spin at 12,000 × g for 20 min and remove the supernatant. 

4. Wash the pellet by adding 500 µl of 70 % ethanol at room 
temperature and centrifuge at 12,000 × g for 5 min. 

5. Aspirate or decant the supernatant and repeat the 70 % 
ethanol wash step once more. 

6. Air dry the pellet and resuspend in a suitable volume of water. 

7. Check an aliquot on a gel and quantify. 

19.3.16. Exo-SAP Digestion of PCR Product 

1. Make a master mix of Exonuclease I (ExoI) and Shrimp Alkaline 
Phosphatase (SAP) for 10 µl of PCR product as per Table 19.3. 

2. Add 1 µl of the master mix to 10 µl of PCR product and set up 
the following incubation protocol in a thermal cycler: 

Both ExoI and SAP are active in 1x Taq buffer and can be 
easily denatured by heating to 85 °C for 15 min. 

19.3.17. Sequencing PCR 

1. Make a master mix of your sequencing reaction based on the 
volumes given in Table 19.4. 

Table 19.3 
Master mix for Exo-SAP digestion 

Components           Units/Rxn.    1x (µl)    100x (µl) 
ExoI (20 U/µl)       0.5           0.025      2.5 
SAP (1 U/µl)         0.5           0.5        50 
PCR buffer, 10x      1x            0.1        10 
MilliQ water                       0.375      37.5 
Total volume                       1          100 
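
The volumes in Table 19.3 follow from dividing the units required per reaction by the stock concentration and topping up with water to 1 µl. A minimal sketch of that arithmetic is given below; the function and dictionary keys are illustrative names, and the enzyme amounts and stock concentrations are simply those listed in the table.

```python
def exo_sap_master_mix(n_reactions, total_per_rxn_ul=1.0):
    """Return component volumes (in µl) of an Exo-SAP master mix.

    Per reaction: 0.5 U ExoI (20 U/µl stock), 0.5 U SAP (1 U/µl stock),
    10x PCR buffer diluted to 1x, and water up to the total volume.
    """
    enzymes = [("ExoI", 0.5, 20.0), ("SAP", 0.5, 1.0)]  # name, units, U/µl
    per_rxn = {name: units / stock for name, units, stock in enzymes}
    per_rxn["PCR buffer 10x"] = total_per_rxn_ul / 10.0
    per_rxn["MilliQ water"] = total_per_rxn_ul - sum(per_rxn.values())
    # Scale every component by the number of reactions.
    return {name: vol * n_reactions for name, vol in per_rxn.items()}


for name, vol in exo_sap_master_mix(100).items():
    print(f"{name:15s} {vol:6.1f} µl")
```

In practice a small excess (for example a few extra reactions) is often prepared to allow for pipetting losses; that choice is left to the user and is not part of the table.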


Table 19.4 
Ingredients of master mix of sequencing reaction 

Reaction            1/4x          1/8x 
BDT                 1             0.5 
5x Seq buffer       1.5           1.75 
Primer (3.2 µM)     3.2 pmol      3.2 pmol (1 µl) 
Template            -             - 
Water               to 10 µl      to 10 µl 





2. Thaw out your primer first and add the correct amount of this 
and water to a tube. 

3 . Thaw out the 5 x buffer, mix well, and add the correct amount 
to the tube. 

4. Remove an aliquot of BDT and thaw on ice. Mix well and spin 
down the tube. 

5. Add the correct amount to the reaction mix. 

6. Mix the master mix well by inversion and spin down. The 
master mix is now ready to be aliquoted into strip tubes, 
a plate, or single tubes. 

7. Add the purified template (up to 6.75 µl), typically 1-5 µl for 
300-1,500 bp products depending on concentration, based 
on 3-10 ng for 200-500 bp, 5-20 ng for 500-1,000 bp, or 
10-40 ng for 1,000-2,000 bp. 

8. Seal the plate with PCR film, or tubes as per normal. Mix the 
reaction by vortexing for 3 s. Flick the product back down to 
the bottom of the wells. 

9. Place the plate/tubes in a PCR machine. 

The reaction is as follows: 
96 °C 10 s 

*This is the extension step, so set the time as you would for a 
PCR, i.e., <1,000 bp/min. 

10. The primer annealing temperature can be altered for difficult 
templates. 

11. Once the sequencing reaction is finished, the samples can be 
stored at -20 °C for a few days, or you can continue directly with 
the cleanup step. 

19.3.18. PCR Product Cleanup 

After the sequencing PCR, excess salts, primers, and enzymes 
should be removed from the amplified products as follows: 

1. Transfer the reaction product into a 1.5 ml tube. 

2. Make master mix I of 10 µl Milli-Q water and 2 µl of 125 mM 
EDTA per reaction. 

3. Add 12 µl of master mix I to each tube containing 10 µl of 
reaction. 

4. Ensure the contents are mixed. 

5. Make master mix II of 2 µl of 3 M NaOAc pH 4.6 and 50 µl of 
ethanol per reaction. 

6. Add 52 µl of master mix II to each reaction. 

7. Mix the contents well and incubate at room temperature for 
15 min. 

8. Spin at 12,000 × g for 20 min at room temperature. 

9. Decant the supernatant. 

10. Add 250 µl of 70 % ethanol and spin at 12,000 × g for 10 min 
at room temperature. 

11. Decant the supernatant. 

12. Add 12-15 µl of Hi-Di formamide, transfer to sample tubes, 
cover with septa, denature, snap chill, and proceed to 
electrophoresis. 

19.3.19. Sequence Analysis 

The resulting ITS sequences were analyzed for homologies to 
sequences deposited in the GenBank and EMBL databases. 

Reference 

1. Applied Biosystems BigDye Terminator v3.1 
Cycle Sequencing Kit Protocol. 




Chapter 20 



Biological Sequence Analysis: Algorithms and Statistical 
Methods 

Suchi Smita, Krishna P. Singh, Bashir A. Akhoon, 
and Shailendra K. Gupta 

Abstract 

With the increase in huge amount of biological sequence data from large genome and proteome 
sequencing projects, efforts have been made to develop computational algorithms and databases to 
manage the information. This chapter is an attempt to highlight some of the commonly used algorithms 
for the biological sequence analysis ranging from pairwise sequence analysis, multiple sequence analysis, 
phylogenetic analysis, and prediction of the probability of a desired motif in the sequence. The chapter is 
organized in the form of basic questions that arise in the researchers’ mind and their step-by-step solution 
using important algorithms and statistical methods. The examples are used and elaborated in such a way 
that the algorithms can be easily understood by students with nonmathematical and nonstatistical 
background. 



20.1 Introduction 



A large number of computational algorithms and tools are available 
to decipher the information hidden in genome and proteome of any 
living organism. Alignment of the nucleotide and protein sequences 
is one of the basic protocols, in any of the genomics, proteomics, 
transcriptomics, and metabolomics project. The knowledge gener- 
ated out of sequence alignment can be used directly or indirectly in a 
variety of applications in molecular biology. 



20.2 Materials 



20.2.1. Pairwise Sequence Alignment Using Dynamic Programming 

Dynamic programming (DP) is a method for solving a complex 
problem by breaking it down into smaller subproblems. In any 
sequencing project, it is crucial to identify the functional 
annotation of the given sequence, which can be achieved by aligning 
the sequence against a completely annotated sequence database. 
Considering the size of nucleotide/protein sequences, the 
alignment is not a straightforward task. For sequence alignment, 
DP is the most common algorithm used to compute the best 
matching of two sequences. The individual positions of the 
sequences are first scored using a scoring matrix, followed by a 
trace back to arrive at the optimal alignment solution. 

Three important steps involved in dynamic programming are 
as follows: 

1. Construction of matrix based on sequence size and setting up 
of initial conditions 

2. Tabular computation (matrix fill) 

3. Trace back through matrix for optimal alignment solution 

20.2.1.1. Performing Global/Local Pairwise Alignment for Given Set of Sequences 

For the two short sequences given below, we will identify the global 
and local alignments between the sequences using the 
Needleman-Wunsch/Smith-Waterman dynamic programming 
approach [1]. 

Sequence 1    AGTAGCTTCCAAA 
Sequence 2    AGAGCTCACAA 
Score         Match = +4; Mismatch = -4; Gap penalty = -2 

20.2.1.2. Construction and Initialization of Matrix 

DP for sequence alignment starts with the construction of a 
matrix based on sequence length. For the above sequences: 

Length of sequence 1 (m) = 13 and 
Length of sequence 2 (n) = 11 

A matrix of [(13 + 1) × (11 + 1)] needs to be constructed in 
this case, as given in Fig. 20.1. The first row and first column of the 
matrix have to be initialized as multiples of the gap penalty (-2). 

Fig. 20.1 Matrix construction and initialization for the dynamic programming. 

20.2.1.3. Matrix Filling 

All the cells of the matrix are scored based on the following 
formula: 

M_{i,j} = \max \begin{cases} M_{i-1,j-1} + S_{i,j} \\ M_{i-1,j} + W \\ M_{i,j-1} + W \end{cases}    (Eq. 20.1) 

where M_{i,j} = maximum score set in the matrix at position i, j; 
S_{i,j} = match/mismatch score; W = gap penalty. 

To fill the value in the cell (2,2) based on (Eq. 20.1) (cell filled 
with gray color in Fig. 20.1): 

M_{2,2} = \max \begin{cases} 0 + 4 \\ -2 + (-2) \\ -2 + (-2) \end{cases} = 4, as shown in Fig. 20.2. 

Fig. 20.2 Matrix with the value calculated for the cell (2,2). 

All the cells of the matrix need to be filled by the same 
procedure. Figure 20.3 shows all the matrix cells with the calculated 
value of M_{i,j} based on (Eq. 20.1). 

Fig. 20.3 Matrix with the calculated M_{i,j} value for each cell. 

20.2.1.4. Trace Back for Optimal Alignment 

The trace back step determines the actual alignment(s) that result 
in the maximum score. There are likely to be multiple maximal 
alignments possible for the given sequences. Trace back starts from 
the last cell in the matrix. The alignment between the sequences is 
generated in reverse order as we move from the last cell to the 
first cell in the matrix. 

For trace back, a reverse arrow is drawn from each cell pointing 
towards the neighboring cell from which its maximum value was 
derived. The direction of the arrowhead determines the gaps in the 
aligned sequences. A total of three arrow directions are possible: 

1. Arrow pointing towards diagonal cell: match/mismatch at the 
given position 

2. Arrow pointing towards upper cell: gap in the upper sequence 

3. Arrow pointing towards left cell: gap in the left sequence 

The optimal global alignment between the two given sequences is 
obtained by following the direction of the arrowheads in Fig. 20.4. 

Fig. 20.4 Trace back steps. The arrow indicates the optimal alignment path. 

The optimal global alignment score = 32. 
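
The three steps above (initialization, matrix fill with Eq. 20.1, and trace back) can be sketched in Python as follows. This is a minimal illustration under the scoring scheme of this example (match +4, mismatch -4, gap -2), not a reference implementation; the function name is arbitrary, and when several optimal paths exist the sketch simply returns one of them, which need not be the path drawn in Fig. 20.4.

```python
def needleman_wunsch(a, b, match=4, mismatch=-4, gap=-2):
    """Global alignment of strings a and b; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    M = [[0] * cols for _ in range(rows)]
    # Initialization: first column and first row are multiples of the gap penalty.
    for i in range(1, rows):
        M[i][0] = i * gap
    for j in range(1, cols):
        M[0][j] = j * gap
    # Matrix fill (Eq. 20.1).
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1] + s, M[i - 1][j] + gap, M[i][j - 1] + gap)
    # Trace back from the last cell to the first cell.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if M[i][j] == M[i - 1][j - 1] + s:  # diagonal: match/mismatch
                top.append(a[i - 1]); bottom.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and M[i][j] == M[i - 1][j] + gap:  # gap in sequence b
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:  # gap in sequence a
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return M[len(a)][len(b)], "".join(reversed(top)), "".join(reversed(bottom))


score, row1, row2 = needleman_wunsch("AGTAGCTTCCAAA", "AGAGCTCACAA")
print(score)
print(row1)
print(row2)
```

For the two sequences of this example the returned score is 32, in agreement with the value stated above.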

A local alignment is defined as the best alignment between 
substrings of the two sequences. Smith and Waterman showed in 
1981 that a local alignment can be computed using essentially the 
same idea employed by Needleman and Wunsch, with a slight 
modification in the calculation of the matrix cell value [2]. The 
formula for computing the value is given below: 

M_{i,j} = \max \begin{cases} M_{i-1,j-1} + S_{i,j} \\ M_{i-1,j} + W \\ M_{i,j-1} + W \\ 0 \end{cases}    (Eq. 20.2) 

From the above equation, it is clear that the value of any 
cell in the matrix will always be ≥ 0. Following Smith and 
Waterman, the first row and the first column of the matrix 
are initialized with zero. The alignment matrix (Fig. 20.4) is 
recalculated based on (Eq. 20.2) to obtain the local alignment 
between the given sequences, as shown in Fig. 20.5. 

Another important distinction is that the score of the best 
local alignment is the highest value found anywhere in the matrix. 
This position is the starting point of the trace back to retrieve an 
optimal alignment, using the same procedure described for the 
global alignment case, as shown in Fig. 20.6. However, the path 
ends as soon as an entry with score zero is reached. It is easy to 
see that the Smith-Waterman algorithm has the same time and 
space complexity as the Needleman-Wunsch algorithm. 

Fig. 20.6 Matrix with the optimal trace back path for the best local alignment between two sequences. 

The local alignment between the two sequences given in the 
problem, based on the Smith and Waterman approach, is 

Sequence 2    AG-AGCT-CACAA 
              || |||| | ||| 
Sequence 1    AGTAGCTTC-CAA 

The best local alignment score = 34. 
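
A minimal Python sketch of the local alignment procedure (Eq. 20.2) is given below, again assuming match +4, mismatch -4, and gap -2. The function name is illustrative. The trace back starts at the highest-scoring cell and stops at the first zero entry; where several optimal local alignments exist, only one is returned.

```python
def smith_waterman(a, b, match=4, mismatch=-4, gap=-2):
    """Local alignment of strings a and b; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    # First row and first column stay at zero.
    M = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(0, M[i - 1][j - 1] + s, M[i - 1][j] + gap, M[i][j - 1] + gap)
            if M[i][j] > best:  # remember the highest value anywhere in the matrix
                best, best_pos = M[i][j], (i, j)
    # Trace back from the best cell until a zero entry is reached.
    top, bottom = [], []
    i, j = best_pos
    while i > 0 and j > 0 and M[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if M[i][j] == M[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif M[i][j] == M[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


score, row1, row2 = smith_waterman("AGTAGCTTCCAAA", "AGAGCTCACAA")
print(score)
print(row1)
print(row2)
```

For the example sequences the best local score is 34, two points higher than the global score because the unmatched final A of sequence 1 is left out of the local alignment instead of being charged as a gap.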



20.2.2. Multiple Sequence Alignment 

Multiple sequence alignment is applied to a set of sequences that 
are assumed to be homologous (to have a common ancestor 
sequence), with the goal of detecting homologous residues and 
placing them in the same column of the multiple alignment. To 
assess evolutionary homology, multiple alignments are better 
suited than pairwise alignments, since if several sequences are 
compared simultaneously, the chance of random similarities 
occurring becomes much lower. Multiple alignments can therefore 
be used to assess both similarity and dissimilarity, for example to 
classify members of protein families and for phylogenetic analysis. 
Multiple alignment is also important for computing profiles, 
predicting protein secondary structure, computing sequence 
motifs, etc. 

Assume that we have established a family F of homologous 
protein sequences, 

where F = \{A_1, A_2, \ldots, A_r\}. 

Multiple sequence alignment can now be used to predict whether a 
new sequence A_0 belongs to the family F or not. One method is to 
align A_0 to each of A_1, A_2, \ldots, A_r in turn. If one of these 
alignments produces a high score, then we may decide that A_0 
belongs to the family F. However, A_0 may not align 
particularly well to any one specific family member and yet 
score well in a multiple alignment. 

Suppose we are given r sequences A_i, i = 1, \ldots, r, over an 
alphabet \Sigma: 

A_1 = (a_{11}, a_{12}, \ldots, a_{1n_1}) 
A_2 = (a_{21}, a_{22}, \ldots, a_{2n_2}) 
\vdots 
A_r = (a_{r1}, a_{r2}, \ldots, a_{rn_r}) 

A multiple sequence alignment (msa) of these sequences is obtained 
by inserting gaps ("-") into the original sequences such that all 
resulting sequences A_i^* have equal length L \geq \max\{n_i \mid i = 1, \ldots, r\}, 
A_i^* = A_i after removal of all gaps from A_i^*, and no column consists 
of gaps only: 

A_1^* = (a_{11}^*, a_{12}^*, \ldots, a_{1L}^*) 
A_2^* = (a_{21}^*, a_{22}^*, \ldots, a_{2L}^*) 
\vdots 
A_r^* = (a_{r1}^*, a_{r2}^*, \ldots, a_{rL}^*) 

For multiple sequence alignment, the Needleman-Wunsch 
algorithm can be generalized by using as scoring function the sum 
of the column scores: 

S(m) = \sum_i S(m_i), 

where m_i denotes the i-th column of the alignment m. 

Consider the alignment of three sequences; a three-dimensional 
matrix has to be generated, analogous to the two-dimensional 
matrix in pairwise alignment. Each cell in the matrix has at most 
seven neighboring cells, and the score for each cell in the matrix 
can be calculated as 

s_{i,j,k} = \max \begin{cases} 
s_{i-1,j-1,k-1} + \delta(v_i, w_j, u_k) \\ 
s_{i-1,j-1,k} + \delta(v_i, w_j, -) \\ 
s_{i-1,j,k-1} + \delta(v_i, -, u_k) \\ 
s_{i,j-1,k-1} + \delta(-, w_j, u_k) \\ 
s_{i-1,j,k} + \delta(v_i, -, -) \\ 
s_{i,j-1,k} + \delta(-, w_j, -) \\ 
s_{i,j,k-1} + \delta(-, -, u_k) 
\end{cases} 

where \delta(x, y, z) is an entry in the 3-D scoring matrix. 

For three sequences of length n, the run time is 7n^3, i.e., O(n^3), 
and for k sequences the total run time can be calculated as 
(2^k - 1)(n^k), i.e., O(2^k n^k). 

The dynamic programming approach for aligning two sequences is 
thus easily extended to k sequences, but it is impractical because 
of its exponential running time. For the given multiple sequence 
alignment between three sequences: 

x: AC-GCGG-C 
y: AC-GC-GAG 
z: GCCGC-GAG 

the following three pairwise alignments are induced: 

Pair 1    x: ACGCGG-C 
          y: ACGC-GAG 

Pair 2    x: AC-GCGG-C 
          z: GCCGC-GAG 

Pair 3    y: AC-GCGAG 
          z: GCCGCGAG 

Thus from any given multiple alignment, we can infer pairwise 
alignments between all the sequences, but these are not 
necessarily optimal. Conversely, in many situations it is difficult 
to infer a "good" multiple alignment from the optimal pairwise 
alignments between all sequences. 
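
The induced pairwise alignments above are obtained by keeping the two rows of interest and deleting every column in which both rows carry a gap. A short Python sketch, with an illustrative function name, reproduces the three pairs from the alignment of x, y, and z:

```python
def induced_pairwise(msa, i, j):
    """Project a multiple alignment onto rows i and j.

    Columns in which both selected rows contain a gap are dropped,
    since a pairwise alignment may not contain gap-only columns.
    """
    kept = [(p, q) for p, q in zip(msa[i], msa[j]) if not (p == "-" and q == "-")]
    return "".join(p for p, _ in kept), "".join(q for _, q in kept)


msa = ["AC-GCGG-C",   # x
       "AC-GC-GAG",   # y
       "GCCGC-GAG"]   # z

for name, (i, j) in {"x/y": (0, 1), "x/z": (0, 2), "y/z": (1, 2)}.items():
    first, second = induced_pairwise(msa, i, j)
    print(name, first, second)
```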

Since exact methods of multiple sequence alignment have 
exponential time complexity, heuristic approaches such as 
progressive alignment, which is based on profiles, are generally 
used. A profile is a very rich representation of a multiple 
alignment: it can show the information in each column and even 
the pairwise relationships. From a profile we can read the relative 
frequencies of the letters of the alphabet and of gaps. Optionally 
we can also distinguish the different kinds of gap events, namely 
Gap Open, Gap Close, and Gap Extension. 
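
As a rough illustration of what a profile contains, the sketch below computes the relative frequency of each nucleotide and of the gap symbol in every column of an alignment. It is a simplified assumption-laden example: the function name is arbitrary, all gaps are pooled into a single "-" frequency instead of being split into open, close, and extension events, and the small three-sequence alignment is used only as sample input.

```python
from collections import Counter


def column_profile(msa, alphabet="ACGT-"):
    """Return one dict per alignment column with relative symbol frequencies."""
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*msa):  # iterate over alignment columns
        counts = Counter(column)
        profile.append({symbol: counts[symbol] / len(msa) for symbol in alphabet})
    return profile


sample = ["AC-GCGG-C", "AC-GC-GAG", "GCCGC-GAG"]  # sample input only
for position, column in enumerate(column_profile(sample), start=1):
    shown = ", ".join(f"{s}={f:.2f}" for s, f in column.items() if f > 0)
    print(position, shown)
```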

Below is a multiple alignment of five sequences and the profile generated from the information available in it.

[Table: multiple alignment of five sequences and its profile, giving the per-column relative frequencies of the nucleotides and of the gap events; the cell values are not recoverable.]

where A, C, G, T are nucleotides.

O = Gap Open is defined to be a gap where the previous character in the sequence is not a gap.

C = Gap Close is defined to be a gap where the next character in the sequence is not a gap.

E = Gap Extension is defined to be a gap where both the next and previous characters are gaps.

Profile representation is important for sequence-to-sequence, sequence-to-profile, and profile-to-profile alignment. For the given two alignments:



Both alignments may be merged into a single multiple alignment by creating their corresponding profiles as below:



20.2.2.1. Greedy Approach for Multiple Alignment

For the given k sequences, the greedy approach for multiple alignment is to choose the most similar pair of strings and combine them into a profile, thereby reducing the alignment of k sequences to an alignment of k-1 sequences/profiles [3]. This step is repeated until all the sequences are aligned. Below is an example of the greedy approach:

k sequences:
u1 = ACGTACGTACGT...
u2 = TTAATTAATTAA...
u3 = ACTACTACTACT...
...
uk = CCGGCCGGCCGG

k-1 sequences/profiles:
u1 = ACg/tTACg/tTACg/cT...
u2 = TTAATTAATTAA...
...
uk = CCGGCCGGCCGG...

In the example above, sequences u1 and u3 are merged into a profile because they represent the most similar pair of strings in the example. A more realistic example of the greedy approach is discussed below:



20.2.3. To Perform Multiple Sequence Alignment for Given Set of Sequences Using Greedy Approach

Consider these four sequences:

s1 = GATTCA
s2 = GTCTGA
s3 = GATATT
s4 = GTCAGC

Score: match = 1; mismatch and indel = -1

The total number of possible alignment pairs can be calculated as

\[
\text{Possible alignments} = \frac{n!}{(n-2)!\,2!}
\]

where n = total number of sequences. For n = 4 this gives 6 pairs.



20.2.3.1. To Perform the Pairwise Alignment Between All the Four Sequences as Discussed in Protocol 1

Below are the pairwise alignments and alignment scores for all the possible pairs:

Pair      Pairwise alignment    Alignment score

Pair 1    s1  GAT-TCA            1
          s2  G-TCTGA

Pair 2    s1  GAT-TCA            1
          s3  GATAT-T

Pair 3    s1  GATTCA--           0
          s4  G-T-CAGC

Pair 4    s2  G-TCTGA           -1
          s3  GATAT-T

Pair 5    s2  GTCTGA             2
          s4  GTCAGC

Pair 6    s3  GAT-ATT           -1
          s4  G-TCAGC

From the above pairwise alignments, Pair 5, representing sequences 2 and 4, has the maximum alignment score based on the scoring parameters given.
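The scores in the table can be checked with a short global-alignment (Needleman-Wunsch style) score routine using the stated parameters (match 1, mismatch -1, indel -1). The function name `nw_score` is only illustrative; the routine returns the optimal score, not the alignment itself.

```python
from itertools import combinations

def nw_score(a, b, match=1, mismatch=-1, indel=-1):
    """Optimal global alignment score, keeping one DP row at a time."""
    prev = [j * indel for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * indel]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + indel, cur[j - 1] + indel))
        prev = cur
    return prev[-1]

seqs = {"s1": "GATTCA", "s2": "GTCTGA", "s3": "GATATT", "s4": "GTCAGC"}
scores = {(a, b): nw_score(seqs[a], seqs[b]) for a, b in combinations(seqs, 2)}
best_pair = max(scores, key=scores.get)   # the pair the greedy step merges first
```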



20.2.3.2. Combining of the Two Sequences

For multiple alignment using the greedy approach, we can combine these two sequences as below:

s2    G T C T   G A
s4    G T C A   G C

s2,4  G T C t/a G a/c

Now we have a set of three sequences:

s1 = GATTCA
s3 = GATATT
s2,4 = G T C t/a G a/c



20.2.3.3. Designing of Pairs and Identification of Best Alignment

We have to repeat the process of designing pairs and identifying the best alignment between the pairs. Below are the optimal pairwise alignments between the three sequences (s1, s3, and s2,4):

Pair      Pairwise alignment     Alignment score

Pair 1    s1    GAT-TCA           1
          s3    GATAT-T

Pair 2    s1    GATTC--A          0
          s2,4  G-T-CTGA

Pair 3    s3    GATATT-          -1
          s2,4  G-TCTGA

From this second set of pairwise alignments, the pair s1 and s3 has the maximum alignment score.



20.2.3.4. Combining of Two Sequences to Get a New Profile

We can combine sequences 1 and 3 to create a profile s1,3.

s1    G A T - T C A
s3    G A T A T - T

s1,3  G A T A T C a/t

Finally, we have a set of two sequences:

s1,3 = G A T A T C a/t
s2,4 = G T C t/a G a/c

which can be aligned with an alignment score of 1 based on the given parameters.

20.2.3.5. Final Multiple Sequence Alignment by Greedy Approach

Based on the greedy approach, the final multiple sequence alignment between the four sequences can be generated by applying the rule “once a gap, always a gap.”

s1  G A T - T C A
s3  G A T A T - T
s2  G - T C T G A
s4  G - T C A G C

20.2.3.6. Progressive Alignment Method for Multiple Sequence Alignment

Another variation of the greedy algorithm for multiple sequence alignment is progressive alignment, which has a somewhat more intelligent strategy for choosing the order of alignments between the given sequences. Progressive alignment works well for closely related sequences, but is not suitable for distantly related ones. It uses profiles to compare sequences, and the gaps in the consensus string are permanent. It is one of the most common approaches for multiple sequence alignment. In general, it works by constructing a series of pairwise alignments, starting with pairs of sequences and then aligning sequences to existing alignments (profiles) and profiles to profiles.

Progressive alignment is a heuristic and does not directly optimize any known global scoring function of alignment correctness. However, it is fast and efficient, and often provides reasonable results [4]. The results of progressive alignment depend on the order in which the sequences are aligned. The generation of a single profile or of several profiles also influences the result, and the result may vary with the scoring function used. Most multiple alignment tools are based on either the complete alignment or the pair-guided alignment method.

In complete alignment, all sequences from two sub-alignments are used in a dynamic programming approach in which the sequences from one subset are aligned in a pairwise manner with all the sequences in the other aligned subset. The alignment with the maximum score determines how the sequence is aligned to the group. In pair-guided alignment, two specific sequences are chosen, one from each sub-alignment group, and the alignment between these two sequences determines the final alignment between the groups.

The most crucial step of progressive alignment heuristics is to determine the order in which the sequences are aligned. Clearly, the most consistent alignments are those which align very similar pairs of sequences first. To determine which sequences are similar, most of the algorithms build a guide tree. The guide tree is a binary phylogenetic alignment tree, where the root node represents the multiple alignment. The nodes furthest away from the root are the most similar pairs. The methods used to compute the guide tree are similar to the distance-based methods used for reconstructing phylogenetic trees, though they are often the rather “quick and dirty” ones. Feng and Doolittle published the first progressive alignment algorithm in 1987; it is based on the idea that a pair of sequences with minimal distance has also diverged most recently in evolution [4]. Thus, for the optimal multiple sequence alignment, one has to follow the evolutionary path of the sequences.

Feng and Doolittle's progressive multiple alignment approach first constructs all the pairwise alignments between the given sequences and calculates the alignment scores, which are then converted into distance scores. Based on the distance scores, a distance matrix is generated that helps to construct a guide tree. In the guide tree, the approach is to start from the first node that was added to the tree and align its two child nodes (which may be two sequences, one sequence and one sub-alignment, or two sub-alignments). These steps are repeated for all other nodes in their tree order until one reaches the root, i.e., until all sequences have been aligned [5].

The distance between the sequences of an alignment pair is based on the normalized percentage similarity, which is calculated as

\[
\text{Similarity} = \frac{\text{Exact matches in the alignment pair}}{\text{Total length of the alignment}}
\]

20.2.4. Performing Multiple Sequence Alignment for Given Set of Sequences Using Progressive Alignment Method

Consider the following five small sequences:

s1 = ATTGCCATT
s2 = ATGGCCATT
s3 = ATCCAATTTT
s4 = ATCTTCTT
s5 = ATTGCCGATT

20.2.4.1. Optimal Pairwise Alignment

Randomly select one sequence as the base sequence and perform the optimal pairwise alignment with each of the other sequences in the group. For example, sequence s1 is selected as the base sequence.
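The base-sequence step and the similarity measure defined earlier can be sketched together. The scoring scheme for this protocol is not stated in the text, so the sketch assumes match 1, mismatch -1, gap -1; `global_align` and `similarity` are illustrative names. When several optimal alignments exist, the traceback simply returns one of them, which need not be the one printed in the book.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with traceback (one optimum)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")   # gap in b
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])   # gap in a
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def similarity(row_a, row_b):
    """Exact matches divided by the total length of the alignment."""
    matches = sum(x == y and x != "-" for x, y in zip(row_a, row_b))
    return matches / len(row_a)

seqs = {"s1": "ATTGCCATT", "s2": "ATGGCCATT", "s3": "ATCCAATTTT",
        "s4": "ATCTTCTT", "s5": "ATTGCCGATT"}
base = "s1"
pairs = {name: global_align(seqs[base], seq)
         for name, seq in seqs.items() if name != base}
```

For s1 and s2 the optimum is the ungapped alignment with 8 matches in 9 columns, so `similarity(*pairs["s2"])` evaluates to 8/9.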



Pairwise alignment

Pair 1
s1  ATTGCCATT
s2  ATGGCCATT

Pair 2
s1  ATTGCCATT--
s3  ATC-CAATTTT

Pair 3
s1  ATTGCCATT
s4  ATCTTC-TT

Pair 4
s1  ATTGCC-ATT
s5  ATTGCCGATT

20.2.4.2. Once a Gap Always a Gap

Merge the pairwise alignments using the rule “once a gap, always a gap” as shown below:

Merging the pairwise alignments of pairs 1 and 2

s1  A T T G C C A T T - -
s2  A T G G C C A T T - -
s3  A T C - C A A T T T T

Merging the pair 3 alignment with the above alignment

s1  A T T G C C A T T - -
s2  A T G G C C A T T - -
s3  A T C - C A A T T T T
s4  A T C T T C - T T - -

Finally, merging the pair 4 alignment with the above alignment. As alignment pair 4 has a gap in s1, the entire column is shifted to incorporate the gap.

s1  A T T G C C - A T T - -
s2  A T G G C C - A T T - -
s3  A T C - C A - A T T T T
s4  A T C T T C - - T T - -
s5  A T T G C C G A T T - -

20.2.4.3. To Obtain Best Alignment

Choose another sequence as the base sequence and repeat steps 1 and 2 to return the best multiple alignment.

20.2.4.4. Construction of Phylogenetic Tree Using UPGMA Method

The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) is a simple hierarchical clustering method used for the creation of phylogenetic trees. UPGMA assumes a constant rate of evolution, and is not a well-regarded method for inferring phylogenetic trees unless this assumption has been tested and justified for the dataset being used. The algorithm iteratively joins the two nearest clusters, until only one cluster is left.

20.2.4.5. UPGMA Algorithm

Let d be the distance function between species; we define the distance D_{i,j} between two clusters of species C_i and C_j as follows:

\[
D_{i,j} = \frac{1}{n_i n_j} \sum_{p \in C_i} \sum_{q \in C_j} d(p, q)
\]

where n_i = |C_i| and n_j = |C_j|.

20.2.4.6. Initialization of Algorithm

1. Initialize n clusters with the given species, one species per cluster.

2. Set the size of each cluster to 1: n_i = 1.

3. In the output tree T, assign a leaf for each species.

20.2.4.7. Iteration

1. Find the i and j that have the smallest distance D_{i,j}.

2. Create a new cluster (ij), which has n_{(ij)} = n_i + n_j members.

3. Connect i and j on the tree to a new node, which corresponds to the new cluster (ij), and give the two branches connecting i and j to (ij) length D_{i,j}/2 each.

4. Compute the distance from the new cluster to all other clusters (except for i and j, which are no longer relevant) as a weighted average of the distances from its components:

\[
D_{(ij),k} = \frac{n_i}{n_i + n_j} D_{i,k} + \frac{n_j}{n_i + n_j} D_{j,k}
\]

5. Delete the columns and rows in D that correspond to clusters i and j, and add a column and row for cluster (ij), with D_{(ij),k} computed as above.

6. Return to 1 until there is only one cluster left.

20.2.5. Construction of Phylogenetic Tree Using Neighbor Join Algorithm

Neighbor joining (NJ) is a bottom-up clustering method used for the construction of phylogenetic trees. The neighbor joining method is a greedy heuristic which joins, at each step, the two closest subtrees that are not already joined. It is based on the minimum evolution principle. One of the important concepts in the NJ method is neighbors, which are defined as two taxa that are connected by a single node in an unrooted tree [6].

20.2.5.1. Initialization of Algorithm

Same as the initialization process of UPGMA.

20.2.5.2. Iteration

For each species, compute

\[
u_i = \sum_{k \neq i} \frac{D_{i,k}}{n - 2}
\]

where u_i is the distance of node i from the rest of the tree.

1. Choose the i and j for which D_{i,j} - u_i - u_j is smallest.

2. Join clusters i and j to a new cluster (ij), with a corresponding node in T. Calculate the branch lengths from i and j to the new node as

\[
d_{i,(ij)} = \frac{D_{i,j}}{2} + \frac{u_i - u_j}{2}, \qquad
d_{j,(ij)} = \frac{D_{i,j}}{2} + \frac{u_j - u_i}{2}
\]

3. Compute the distances between the new cluster and each other cluster:

\[
D_{(ij),k} = \frac{D_{i,k} + D_{j,k} - D_{i,j}}{2}
\]

4. Delete clusters i and j from the tables, and replace them by (ij).

5. If more than two nodes (clusters) remain, go back to 1. Otherwise, connect the two remaining nodes by a branch of length D_{i,j}.
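One neighbor joining iteration, following the steps just listed, can be sketched as below. The function name `nj_step` and the nested-dictionary layout of the distance matrix are assumptions made for illustration; the example distances are mismatch counts between four short sequences A = ACTA, B = ACTT, C = CGTT, and D = AGAT.

```python
def nj_step(names, D):
    """One neighbor joining iteration (requires more than two clusters).

    Returns the joined pair, the two branch lengths to the new node,
    and the distances from the new node to every remaining cluster.
    """
    n = len(names)
    # u_i: summed distance from i to all other clusters, divided by n - 2
    u = {i: sum(D[i][k] for k in names if k != i) / (n - 2) for i in names}
    best = None
    for a in range(n):
        for b in range(a + 1, n):
            i, j = names[a], names[b]
            q = D[i][j] - u[i] - u[j]
            if best is None or q < best[0]:    # ties keep the first pair found
                best = (q, i, j)
    _, i, j = best
    len_i = D[i][j] / 2 + (u[i] - u[j]) / 2
    len_j = D[i][j] / 2 + (u[j] - u[i]) / 2
    new_dist = {k: (D[i][k] + D[j][k] - D[i][j]) / 2
                for k in names if k not in (i, j)}
    return (i, j), (len_i, len_j), new_dist

names = ["A", "B", "C", "D"]
D = {"A": {"A": 0, "B": 1, "C": 3, "D": 3},
     "B": {"A": 1, "B": 0, "C": 2, "D": 2},
     "C": {"A": 3, "B": 2, "C": 0, "D": 2},
     "D": {"A": 3, "B": 2, "C": 2, "D": 0}}
pair, branch_lengths, new_dist = nj_step(names, D)
```

With four taxa the criterion D_ij - u_i - u_j ties for the pairs (A, B) and (C, D), since joining one pair implies the other; the sketch reports the first pair it finds.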



20.2.6. To Construct Phylogenetic Tree Using UPGMA and Neighbor Joining Methods

Consider four sequences:

A = ACTA
B = ACTT
C = CGTT
D = AGAT

20.2.6.1. Calculating Number of Mismatches (Distances) Between Two Sequences

      A    B    C
B     1
C     3    2
D     3    2    2

A, B is the most similar pair; it will be clustered together as follows:

(A, B)
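A sketch of the full UPGMA run on these four sequences, which the following subsections work through by hand. The names `hamming` and `upgma` and the use of frozenset keys for the distance table are illustrative choices; each merge is reported with the height of the new node, which is half the distance between the two joined clusters.

```python
from itertools import combinations

def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def upgma(names, dist):
    """Return the list of merges as (cluster_i, cluster_j, node_height)."""
    size = {name: 1 for name in names}     # cluster label -> number of members
    d = dict(dist)                         # frozenset({x, y}) -> distance
    merges = []
    while len(size) > 1:
        i, j = min(combinations(sorted(size), 2),
                   key=lambda p: d[frozenset(p)])
        dij = d[frozenset((i, j))]
        ni, nj = size.pop(i), size.pop(j)
        new = f"({i},{j})"
        for k in size:                     # size-weighted average distance
            d[frozenset((new, k))] = (ni * d[frozenset((i, k))]
                                      + nj * d[frozenset((j, k))]) / (ni + nj)
        size[new] = ni + nj
        merges.append((i, j, dij / 2))
    return merges

seqs = {"A": "ACTA", "B": "ACTT", "C": "CGTT", "D": "AGAT"}
dist = {frozenset((a, b)): hamming(seqs[a], seqs[b])
        for a, b in combinations(seqs, 2)}
merges = upgma(list(seqs), dist)
```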



20.2.6.2. Calculation of New Distance Matrix



The new distance matrix is calculated as

dist(AB),C = (dist AC + dist BC)/2 = 2.5
dist(AB),D = (dist AD + dist BD)/2 = 2.5

Now CD is the most similar pair; it will be clustered together.

20.2.6.3. Calculation of Average Distance

The final calculation is to take the average distance between the two composite sets of sequences AB and CD. The average distance is calculated as

dist(AB),(CD) = (dist AC + dist AD + dist BC + dist BD)/4 = 2.5

       AB
CD     2.5

One-half of this distance, 2.5/2 = 1.25, is included in the part of the tree that goes from the root to CD, and the other half goes from the root to AB.

The best tree generated using the UPGMA method:

((A, B), (C, D))

20.2.7. Construction of Phylogenetic Tree Using Neighbor Joining Method

We will generate the neighbor joining tree for the sequence data given in protocol 4.

We can define the evolutionary distance between the sequences as follows:

\[
d_{ij} = -\ln(1 - p_{ij})
\]

where d_{ij} is the distance between sequences i and j and p_{ij} is the fraction of mismatches in the pairwise alignment of sequences i and j.

20.2.7.1. Designing of a Base Tree

Design a base tree as follows:

20.2.7.2. Calculation of the Distance Between All Possible Pairs

Calculate the distance between all the possible pairs of the four sequences using the formula:

\[
d_{ij} = -\ln(1 - p_{ij})
\]

Construction of the distance matrix

      A          B          C          D
A     0
B     0.28768    0
C     1.38629    0.69314    0
D     1.38629    0.69314    0.69314    0

From the matrix, A and B have the least distance; we will choose them as neighbors and join them as a new node “X.”

20.2.7.3. Calculation of the Distance

Now, we will again calculate the distance

       AB          C          D
AB     0
C      1.039715    0
D      1.039715    0.69314    0

The distance of C from the node X can be calculated as

d(C, AB) = [d(C, A) + d(C, B)]/2

20.2.7.4. Joining of the Sequence C and D to Create a Complete Tree

We will now join sequences C and D to create a complete tree.

The best tree generated between the given sequences using the neighbor joining method:

[Figure: tree joining A and B through node X, and C and D.]

20.2.8. Markov and Hidden Markov Model

Markov models (MM) and hidden Markov models (HMM) are supervised machine learning techniques that have many applications in bioinformatics, such as sequence comparison, gene finding, homology modeling to identify known folds in a target sequence, and phylogeny and function prediction. An MM consists of several states of a system, which represent observations at a specific point in time, and a set of transition probabilities. An HMM consists of a set of states just like an MM, together with hidden states whose probability distribution over the observed symbols is defined as the emission probability.

20.2.8.1. HMM for Multiple Sequence Alignment

Generate the profile HMM for a multiple sequence alignment. Consider the multiple alignment of five DNA sequences:

T C - C T -
- C A C T G
T - - C T A
T G - G C A
C C - T T C

20.2.8.2. Matching (M), Insert (I), and Delete (D) of the Multiple (MSA) Sequence Alignment

In order to generate the profile HMM, we have to define the matching (M), insert (I), and delete (D) states of the multiple sequence alignment (MSA).

Columns in which more than half of the positions are gaps are considered inserts. In the multiple sequence alignment given above, column 3 is an insert, as only one sequence has a base there, while columns 1, 2, 4, 5, and 6 are considered match states. The nucleotide in each match state is defined from the consensus generated after multiple alignment.

[Figure: a sample representation of an HMM. The box represents a match state, the diamond represents an insertion in the sequence, and the circle represents a deletion.]

20.2.8.3. Assumptions from the Multiple Sequence Alignment

From the multiple sequence alignment given above, the following assumptions can be made:

From Column 1 to 2 in the MSA
3 transitions to the next match state
1 transition to the delete state
1 transition to the insertion state

From Column 2 to 3 in the MSA
0 transitions to the next match state
4 transitions to the delete state
1 transition to the insertion state

From Column 3 to 4 and 4 to 5 in the MSA
5 transitions to the next match state
0 transitions to the delete state
0 transitions to the insertion state

From Column 5 to 6 in the MSA
4 transitions to the next match state
1 transition to the delete state
0 transitions to the insertion state

20.2.8.4. The Hidden Markov Model

The HMM for all the five sequences in the multiple sequence alignment can be represented as follows:

[Figure: HMM for sequence 1]

[Figure: HMM for sequence 2]

[Figure: HMM for sequence 3]
























Smita, Singh, Akhoon, and Gupta 



[Figure: HMM for sequence 4]

[Figure: HMM for sequence 5]

The transition path between the various states in the models is represented by dark black arrows.

20.2.8.5. Calculation of the Probability Matrix

From the HMMs of all the individual sequences, the probability matrix can be calculated as

       1      2      3      4      5      6
A      0      0      1/5    0      0      2/5
T      3/5    0      0      1/5    4/5    0
G      0      1/5    0      1/5    0      1/5
C      1/5    3/5    0      3/5    1/5    1/5
-      1/5    1/5    4/5    0      0      1/5

20.2.8.6. Calculation of Various Parameters of Profile HMM (Match Emissions/Insert Emissions/State Transitions)

[Table: match emission, insert emission, and state transition parameters of the profile HMM for positions 0 to 5; the values are not recoverable.]

Based on the probability matrix and the match emission, insert emission, and state transition parameters, the overall HMM for the multiple sequence alignment can be summarized as shown in Fig. 20.7.

Fig. 20.7 The HMM for the multiple sequence alignment of five sequences. The probabilities of the various transitions between the states of the HMM are shown. The squares represent match emissions, the diamonds represent insert emissions, and the circles represent deletions.



In many cases the Laplace correction has to be applied. Typically, Laplace's rule (add one observation per symbol per column) is used to smooth the distributions, along with relative frequency estimation. However, a plain column profile does not provide a mechanism to insert symbols between columns, which may be necessary. Such a mechanism is provided by a profile HMM, which can be estimated from a given MSA using relative frequency estimation and simple smoothing.

For the multiple sequence alignment given above, the Laplace correction can be made in the profile as follows:

[Table: Laplace-corrected profile, with one row per base and columns 1 to 6; the values are not recoverable.]




20.2.8.7. Assigning Transition Probability

The probabilities using Laplace's rule from Column 1 are

4/10 state M1M2
2/10 state M1D2
2/10 state M1I2
2/10 state D1D2

The probabilities using Laplace's rule from Column 2 are

1/10 state M2M3
5/10 state M2D3
2/10 state M2I3
2/10 state D2D3

The probabilities for the other columns are calculated in a similar fashion.


20.2.8.8. Assigning Emission Probability

In the first column there are 3 T's, 1 C, and 1 gap. Using Laplace's rule, these counts become 4 T, 2 C, 1 A, 1 G, and 2 gaps. Thus the probabilities will be 4/10 for T, 2/10 for C, 1/10 for A, 1/10 for G, and 2/10 for gaps. The probabilities for the other columns can be assigned in a similar way.
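The add-one rule for emission probabilities can be sketched as follows, treating the gap as a fifth symbol as the text does for column 1. The helper name `laplace_emissions` is hypothetical, and exact fractions are used so the results can be compared directly with the values above.

```python
from fractions import Fraction

def laplace_emissions(column, alphabet="ACGT-"):
    """Add one pseudo-count per symbol, then normalize the column counts."""
    total = len(column) + len(alphabet)
    return {s: Fraction(column.count(s) + 1, total) for s in alphabet}

# The five-sequence alignment from the text, one string per row.
msa = ["TC-CT-", "-CACTG", "T--CTA", "TG-GCA", "CC-TTC"]
columns = ["".join(col) for col in zip(*msa)]
column1 = laplace_emissions(columns[0])   # T 4/10, C 2/10, A 1/10, G 1/10, gap 2/10
```

Applying the same function to `columns[1]` through `columns[5]` gives the smoothed emissions for the remaining columns.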


20.2.9. Markov Model

Generate the probability matrix for the coding and noncoding regions of a gene using zero- and first-order Markov models. A fundamental Markov model of a process is a model where every state corresponds to an observable event and the state transition probabilities depend just on the current and predecessor state. Gene finding in DNA has become one of the most important computational biology problems that can be addressed using Markov models [7]. A Markov chain is defined by a state space, which is the set of possible values that the Markov chain can take, and the transition probabilities of the process.

20.2.9.1. K-Order Markov Model

\[
P(X) = p(x_L, x_{L-1}, \ldots, x_1)
     = p(x_L \mid x_{L-1}, \ldots, x_1)\, p(x_{L-1} \mid x_{L-2}, \ldots, x_1) \cdots p(x_1)
\]

Let us design zero- and first-order Markov models for the coding and noncoding regions of the mtb48 gene of Mycobacterium tuberculosis (GenBank id: AY029285.1), from positions 207-1,583 and 1,591-2,138, respectively.

For the coding sequence of mtb48 gene of M. tuberculosis 
(GenBank id: AY029285.1 position: 207-1,583) 

acgcagtcgcagaccgtgacggtggatcagcaagagattttgaacagggccaacgaggtggaggcc 

ccgatggcggacccaccgactgatgtccccatcacaccgtgcgaactcacggcggctaaaaacgccgc 

ccaacagctggtattgtccgccgacaacatgcgggaatacctggcggccggtgccaaagagcggcag 

cgtctggcgacctcgctgcgcaacgcggccaaggcgtatggcgaggttgatgaggaggctgcgacc 

gcgctggacaacgacggcgaaggaactgtgcaggcagaatcggccggggccgtcggaggggaca 

gttcggccgaactaaccgatacgccgagggtggccacggccggtgaacccaacttcatggatctcaaa 

gaagcggcaaggaagctcgaaacgggcgaccaaggcgcatcgctcgcgcactttgcggatgggtgg 

aacactttcaacctgacgctgcaaggcgacgtcaagcggttccgggggtttgacaactgggaaggcg 

atgcggctaccgcttgcgaggcttcgctcgatcaacaacggcaatggatactccacatggccaaattga 

gcgctgcgatggccaagcaggctcaatatgtcgcgcagctgcacgtgtgggctaggcgggaacatcc 

gacttatgaagacatagtcgggctcgaacggctttacgcggaaaacccttcggcccgcgaccaaattct 

cccggtgtacgcggagtatcagcagaggtcggagaaggtgctgaccgaatacaacaacaaggcagcc 

ctggaaccggtaaacccgccgaagcctccccccgccatcaagatcgacccgcccccgcctccgcaaga 

gcagggattgatccctggcttcctgatgccgccgtctgacggctccggtgtgactcccggtaccggga

tgccagccgcaccgatggttccgcctaccggatcgccgggtggtggcctcccggctgacacggcgg

cgcagctgacgtcggctgggcgggaagccgcagcgctgtcgggcgacgtggcggtcaaagcggc

atcgctcggtggcggtggaggcggcggggtgccgtcggcgccgttgggatccgcgatcgggggc

gccgaatcggtgcggcccgctggcgctggtgacattgccggcttaggccagggaagggccggcgg

cggcgccgcgctgggcggcggtggcatgggaatgccgatgggtgccgcgcatcagggacaaggg

ggcgccaagtccaagggttctcagcaggaagacgaggcgctctacaccgaggatcgggcatggacc

gaggccgtcattggtaaccgtcggcgccaggacagtaaggagtcgaag

Total number of bases is 1,377 with the base count as follows:
A: 281; C: 410; G: 488; T: 198

20.2.9.2. For Zero-Order Markov Model

\[
P(X) = p(x_L)\, p(x_{L-1}) \cdots p(x_1)
\]

where x = {A, G, T, C}.

In the zero-order Markov model, the probability distribution of the next base to be generated in the sequence does not depend on any of the bases preceding it.

Thus the zero-order probability matrix for the coding region of the mtb48 gene of M. tuberculosis can be defined as

A        C        G        T
0.204    0.298    0.354    0.143

20.2.9.3. For First-Order Markov Model

\[
P(X) = p(x_L \mid x_{L-1})\, p(x_{L-1} \mid x_{L-2}) \cdots p(x_2 \mid x_1)\, p(x_1)
\]

where x = {A, G, T, C}.

In the first-order Markov model, the probability distribution of the next base to be generated in the sequence depends on the base preceding it.

p(x_L | x_{L-1}) is the probability of observing x_L after x_{L-1} in the sequence. The solution is to obtain the dinucleotide frequency data for the sequence as below:



P(AA)    0.27018
P(AT)    0.18246
P(AG)    0.26667
P(AC)    0.2807
P(TA)    0.11881
P(TT)    0.15347
P(TG)    0.40594
P(TC)    0.32178
P(GA)    0.20367
P(GT)    0.12016
P(GG)    0.35031
P(GC)    0.32587
P(CA)    0.20048
P(CT)    0.14493
P(CG)    0.3913
P(CC)    0.26329

The probability matrix can be generated based on the above dinucleotide frequencies for the given sequence of mtb48 of M. tuberculosis.

Probability matrix of the first-order Markov chain model for the coding region of the mtb48 gene of M. tuberculosis (rows: preceding base; columns: next base)

       A           G           T           C
A      0.270463    0.266904    0.181495    0.281139
G      0.203285    0.351129    0.119097    0.326489
T      0.116162    0.409091    0.151515    0.323232
C      0.200000    0.392683    0.143902    0.263415



For the noncoding sequence of the mtb48 gene of M. tuberculosis (GenBank id: AY029285.1, position: 1,591-2,138)

ggacgaattggacccgcatgtcgcccgggcgttgacgctggcggcgcgglttcagtcggccctagacggg 

acgctcaatcagatgaacaacggatccttccgcgccaccgacgaagccgagaccgtcgaagtgacgatcaat 

gggcaccagtggctcaccggcctgcgcatcgaagatggtttgctgaagaagctgggtgccgaggcggtgg 

ctcagcgggtcaacgaggcgctgcacaatgcgcaggccgcggcgtccgcgtataacgacgcggcgggcg 

agcagctgaccgctgcgttatcggccatgtcccgcgcgatgaacgaaggaatggcctaagcccattgttgcg 

gtggtagcgactacgcaccgaatgagcgccgcaatgcggtcattcagcgcgcccgacacggcgtgagtacg 

cattgtcaatgttttgacatggatcggccgggttcggagggcgccatagtcctggtcgccaatattgccgcag 

ctagctggtcttaggttcggttacgctggttaattatgacgtccgttacca 

Sequence length = 548

Base count

A: 107; C: 156; G: 183; T: 102

The zero-order probability matrix for the noncoding region of the above-mentioned sequence can be defined as

A        C        G        T
0.195    0.285    0.334    0.186

20.2.10. Determination of the Probability of DNA Fragment “AGTAGCTTCCAG” in the Coding Region of mtb48 Gene of M. tuberculosis Using Probability Matrix Generated in Protocol 20.2.9

For the sequence X = AGTAGCTTCCAG

20.2.10.1. Zero-Order Markov Model Probability in the Coding Region of mtb48 Gene

The zero-order Markov model probability in the coding region of the mtb48 gene will be

P(X) = P(A) * P(G) * P(T) * P(A) * P(G) * P(C) * P(T) * P(T) * P(C) * P(C) * P(A) * P(G)
     = 0.204 * 0.354 * 0.143 * 0.204 * 0.354 * 0.298 * 0.143 * 0.143 * 0.298 * 0.298 * 0.204 * 0.354

P = 2.91445E-08
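The zero-order calculation is a plain product of per-base probabilities, which a few lines of Python can reproduce. The dictionary names and the function `zero_order_probability` are illustrative; the numbers are the zero-order matrices given in the text.

```python
import math

CODING = {"A": 0.204, "C": 0.298, "G": 0.354, "T": 0.143}
NONCODING = {"A": 0.195, "C": 0.285, "G": 0.334, "T": 0.186}

def zero_order_probability(seq, probs):
    """Probability of seq when every base is drawn independently."""
    return math.prod(probs[base] for base in seq)

fragment = "AGTAGCTTCCAG"
p_coding = zero_order_probability(fragment, CODING)        # about 2.914e-08
p_noncoding = zero_order_probability(fragment, NONCODING)  # same fragment, noncoding model
```

For long sequences such products underflow quickly, so summing logarithms of the probabilities is the usual alternative.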



20.2.10.2. First-Order Markov Model Probability in the Coding Region of mtb48 Gene

The first-order Markov model probability in the coding region of the mtb48 gene will be

P(X) = P(A) * P(AG) * P(GT) * P(TA) * P(AG) * P(GC) * P(CT) * P(TT) * P(TC) * P(CC) * P(CA) * P(AG)
     = 0.204066 * 0.266904 * 0.119097 * 0.116162 * 0.266904 * 0.326489 * 0.143902 * 0.151515 * 0.323232 * 0.263415 * 0.20 * 0.266904

P = 6.50694E-09
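The first-order calculation multiplies the probability of the first base by one transition probability per adjacent pair of bases. A sketch using the first-order matrix for the coding region (rows are the preceding base); the names `TRANS`, `P_FIRST_A`, and `first_order_probability` are illustrative.

```python
# First-order transition matrix for the coding region: TRANS[prev][next]
TRANS = {
    "A": {"A": 0.270463, "G": 0.266904, "T": 0.181495, "C": 0.281139},
    "G": {"A": 0.203285, "G": 0.351129, "T": 0.119097, "C": 0.326489},
    "T": {"A": 0.116162, "G": 0.409091, "T": 0.151515, "C": 0.323232},
    "C": {"A": 0.200000, "G": 0.392683, "T": 0.143902, "C": 0.263415},
}
P_FIRST_A = 0.204066   # probability of the first base A, as used in the text

def first_order_probability(seq, p_first, trans):
    """p(x1) times the product of p(x_i | x_(i-1)) over the sequence."""
    p = p_first
    for prev, cur in zip(seq, seq[1:]):
        p *= trans[prev][cur]
    return p

p_first_order = first_order_probability("AGTAGCTTCCAG", P_FIRST_A, TRANS)
```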



20.2.11. Determination of the Most Likely Path for Sequence “CGCGTTACTTCAATG” in Frame 1 for the Zero-Order HMM Derived for the mtb48 Gene in Protocol 20.2.9.2, Using the Following Initial Transition Probabilities

a_0c = a_0n = 0.5
a_nn = a_cc = 0.5
a_cn = 0.55
a_nc = 0.45

(c = coding, n = noncoding, 0 = start)

20.2.11.1. Hidden State Transition Probability

The hidden state transition probability (provided in the protocol) can be summarized as follows:

From \ To     Coding    Noncoding
Coding        0.5       0.55
Noncoding     0.45      0.5

20.2.11.2. Observable State Probability Matrix from the Zero-Order Markov Model for Coding and Noncoding Region of mtb48 Gene as Generated in Protocol 20.2.9.2 and 20.2.9.3

              A        T        G        C
Coding        0.204    0.143    0.354    0.298
Noncoding     0.195    0.186    0.334    0.285

20.2.11.3. Starting Distribution (Provided in the Protocol)

Coding    Noncoding
0.5       0.5

20.2.11.4. Determination of Most Likely Path Using Viterbi Algorithm

The most likely path for any sequence can be determined using the Viterbi algorithm. The Viterbi algorithm is a computationally efficient method for determining the most probable path taken through a Markov graph [8].

For the problem in hand, the most likely path of the sequence CGCGTTACTTCAATG in frame 1 of the mtb48 gene can be calculated based on the HMM as shown in Fig. 20.8.
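For reference, a textbook Viterbi decoder for a two-state coding/noncoding HMM can be sketched as below, using the start, transition, and emission values given in this protocol exactly as stated (the transition values out of each state do not sum to 1, and they are used unchanged here). This is a sketch of the standard algorithm, which carries the best cumulative score forward in log space; the hand calculation in the text instead reports a separate value for each position, so the two need not agree step by step. The names `START`, `TRANS`, `EMIT`, and `viterbi` are illustrative.

```python
import math

STATES = ("coding", "noncoding")
START = {"coding": 0.5, "noncoding": 0.5}
TRANS = {"coding": {"coding": 0.5, "noncoding": 0.55},      # TRANS[from][to]
         "noncoding": {"coding": 0.45, "noncoding": 0.5}}
EMIT = {"coding": {"A": 0.204, "T": 0.143, "G": 0.354, "C": 0.298},
        "noncoding": {"A": 0.195, "T": 0.186, "G": 0.334, "C": 0.285}}

def viterbi(seq):
    """Most likely hidden-state path for seq (standard Viterbi, log space)."""
    v = [{s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}]
    back = []
    for base in seq[1:]:
        row, ptr = {}, {}
        for s in STATES:
            # best predecessor state for ending in state s at this position
            prev = max(STATES, key=lambda r: v[-1][r] + math.log(TRANS[r][s]))
            row[s] = (v[-1][prev] + math.log(TRANS[prev][s])
                      + math.log(EMIT[s][base]))
            ptr[s] = prev
        v.append(row)
        back.append(ptr)
    state = max(STATES, key=lambda s: v[-1][s])
    path = [state]
    for ptr in reversed(back):             # follow the back-pointers
        state = ptr[state]
        path.append(state)
    return list(reversed(path))

path = viterbi("CGCGTTACTTCAATG")
```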

The probability of the first base “C” in the sequence “CGCGTTACTTCAATG” to be in the coding region can be calculated as

P_coding = P(C | coding) * Start_coding
         = 0.298 * 0.5
         = 1.49 * 10^-1,

while the probability of “C” to be in the noncoding region can be calculated as

P_noncoding = P(C | noncoding) * Start_noncoding
            = 0.285 * 0.5
            = 1.425 * 10^-1.

Fig. 20.8 Hidden Markov Model for the coding/noncoding region mapping.

Of the above two probabilities of the first base “C” being in the coding or noncoding region, the probability is higher for the coding region. Thus it is likely that the given sequence starts in the coding region.

For the second base, i.e., “G” in the sequence “CGCGTTACTTCAATG,” the probability of being in the coding and noncoding region can be calculated as

\[
P_{\text{coding}} = \max
\begin{cases}
P(\text{coding} \mid \text{coding}) \cdot P_{n-1}(\text{coding}) \\
P(\text{coding} \mid \text{noncoding}) \cdot P_{n-1}(\text{noncoding})
\end{cases}
\cdot P(G \mid \text{coding})
= \max\{0.5 \cdot 0.298,\; 0.45 \cdot 0.285\} \cdot 0.354
= 5.274 \times 10^{-2}
\]

\[
P_{\text{noncoding}} = \max
\begin{cases}
P(\text{noncoding} \mid \text{coding}) \cdot P_{n-1}(\text{coding}) \\
P(\text{noncoding} \mid \text{noncoding}) \cdot P_{n-1}(\text{noncoding})
\end{cases}
\cdot P(G \mid \text{noncoding})
= \max\{0.55 \cdot 0.298,\; 0.5 \cdot 0.285\} \cdot 0.334
= 5.474 \times 10^{-2}
\]



Of the two probabilities calculated above for the second base “G”,
the one for the noncoding region is higher. Thus it is likely that
the second nucleotide of the given sequence comes from the
noncoding region. The probabilities for all the other bases in the
sequence can be calculated in the same way; the values for all
positions are given in Table 20.1. At each position, the larger of
the coding and noncoding values (highlighted with a yellow
background in the printed table) shows the most likely path.

Table 20.1
The probability calculated for all the positions of the
sequence CGCGTTACTTCAATG using the Viterbi algorithm

Base    Coding       Noncoding
C       0.149        0.1425
G       0.052746     0.0547426
C       0.052746     0.0554895
G       0.052746     0.0554895
T       0.025311     0.0362142
T       0.0119691    0.017298
A       0.0170748    0.018135
C       0.030396     0.031977
T       0.021307     0.0304854
T       0.0119691    0.017298
C       0.0249426    0.026505
A       0.030396     0.0319605
A       0.020808     0.021879
T       0.014586     0.0208692
G       0.0296298    0.031062

Thus, the most likely path of the given sequence can be 
identified as 

C  G  C  G  T  T  A  C  T  T  C  A  A  T  G
c  nc nc nc nc nc nc nc nc nc nc nc nc nc nc
c = coding and nc = noncoding 
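
The arithmetic of the worked example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the start probability of 0.5 per state and the emission values for A and T are not quoted in the text above but are back-calculated from Table 20.1, and the function and variable names are hypothetical. The sketch follows the worked example in taking P(n-1) as the emission probability of the preceding base in each state; a textbook Viterbi recursion would instead carry forward the cumulative score of the best path.

```python
# Transition values as written in the worked example, indexed TRANS[to][from],
# i.e. TRANS["c"]["nc"] is the value the text writes as P(coding | noncoding).
TRANS = {
    "c": {"c": 0.5, "nc": 0.45},
    "nc": {"c": 0.55, "nc": 0.5},
}

# Assumption: equal start probability for both states (0.5 * 0.298 = 0.149).
START = {"c": 0.5, "nc": 0.5}

# Emission probabilities. The C and G values appear in the worked example;
# the A and T values are assumptions back-calculated from Table 20.1.
EMIT = {
    "c": {"A": 0.204, "C": 0.298, "G": 0.354, "T": 0.143},
    "nc": {"A": 0.195, "C": 0.285, "G": 0.334, "T": 0.186},
}

STATES = ("c", "nc")


def score_positions(seq):
    """Return (base, coding value, noncoding value) for every position."""
    rows = []
    for i, base in enumerate(seq):
        row = {}
        for state in STATES:
            if i == 0:
                # First base: start probability times emission probability.
                row[state] = START[state] * EMIT[state][base]
            else:
                # As in the worked example, P(n-1) is the emission
                # probability of the previous base in each state.
                prev = seq[i - 1]
                best = max(TRANS[state][s] * EMIT[s][prev] for s in STATES)
                row[state] = best * EMIT[state][base]
        rows.append((base, row["c"], row["nc"]))
    return rows


def most_likely_path(rows):
    """Pick the state with the larger value at each position."""
    return ["c" if coding > noncoding else "nc" for _, coding, noncoding in rows]


if __name__ == "__main__":
    table = score_positions("CGCGTTACTTCAATG")
    for base, coding, noncoding in table:
        print(f"{base}  {coding:.7f}  {noncoding:.7f}")
    print(" ".join(most_likely_path(table)))
```

With these assumed values the sketch gives the entries of Table 20.1, except for the noncoding value at the fourth position (G), where it returns 0.0547426, and it yields the same path: coding for the first base and noncoding for the rest.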
Appendix 1: Biosafety, GLP, and Biosecurity



Every student and teacher working in the laboratory should strictly
follow the guidelines below for their own safety. These guidelines
are provided to maintain a safe laboratory environment. It is the
responsibility of each person who enters the laboratory to
understand the safety and health hazards associated with
potentially hazardous materials and equipment in the laboratory.
It is also the individual’s responsibility to practice the
following general safety guidelines at all times:



General Safety Guidelines

1. Always wear proper eye protection in chemical work, handling,
   and storage areas. Contact lenses should normally not be worn.

2. Always wear appropriate protective clothing with a suitable
   lab coat or apron.

3. Only closed-toe shoes are to be worn in the laboratory.
   Sandals are not permitted.

4. Confine long hair and loose clothing.

5. Work areas or surfaces must be disinfected before and after
   use.

6. Always wash hands and arms with soap and water before
   leaving the work area. This applies even if you have been
   wearing gloves.

7. Never engage in horseplay, pranks, or other acts of mischief
   in biological work areas.

8. Label all materials with your name, date, and any other
   applicable information.

9. Do not pour biohazardous fluids down the sink.

10. Be familiar with the location of emergency equipment: fire
    alarm, fire extinguisher, emergency eyewash, and safety
    shower. Know the appropriate emergency response procedure.

11. Never mouth pipette chemicals when transferring solutions.
    Always use a pipette bulb instead.

12. Report any accident, however minor, immediately.

Eating, Drinking, and Smoking

Eating, drinking, smoking, gum chewing, applying cosmetics, and
taking medicines in laboratories are strictly prohibited.

1. Food, beverages, cups, and other drinking and eating utensils
   should not be stored in areas where hazardous materials are
   stored or handled.

2. Glassware used for laboratory operations should never be
   used to prepare or consume food or beverages.

3. Laboratory refrigerators, ice chests, cold rooms, ovens, and so
   forth should not be used for food storage or preparation.

4. Laboratory water sources and deionized water should not be
   used for drinking water.

5. Laboratory materials should never be consumed or tasted.

Housekeeping and Maintenance

Keeping the laboratory clean and organized helps provide a safer
environment. Keep drawers and cabinet doors closed and electrical
cords off the floor to avoid tripping hazards. Keep aisles clear of
obstacles such as boxes, chemical containers, and other stored
items. Avoid slipping hazards by cleaning up spilled liquids
promptly and by keeping the floor free of loose equipment such as
stirring rods, glass beads, and stoppers. Never block, even
partially, the path to an exit or to safety equipment such as a
safety shower or fire extinguishers. Use the required procedure for
the proper disposal of chemical wastes and solvents. Supplies and
laboratory equipment on shelves should have sufficient clearance
so that, in case of a fire, the fire sprinkler heads are able to
carry out their function. The work area should be kept clean and
uncluttered, with hazardous materials and equipment properly
stored. Clean the work area upon completion of a task and at the
end of the day. The custodial staff is only expected to perform
routine duties such as cleaning the floor and emptying the general
trash.

In preparation for any maintenance service, such as fume hood
repair or plumbing or electrical work, the laboratory staff must
prepare the laboratory before the maintenance personnel arrive.
Whenever possible, remove hazards that maintenance personnel may
encounter during their work. For example, infectious agents,
radioactive materials, or chemicals must be moved to a secure area
before maintenance work begins.



Biohazard Waste Disposal

Dispose of items in the special receptacles as indicated below:

Material                                Method of disposal
Agar slants with biological materials   Place tube upright in indicated test tube rack,
                                        but place caps in baskets as indicated
Biological liquid (not in test tubes)   Leave in container with closed cap
Biological liquid in test tubes         Place tube upright in indicated test tube rack,
                                        but place caps in baskets as indicated
Broken glass (contaminated)             Sharps container
Broken glass (not contaminated)         Broken glass container
Cotton swabs (contaminated)             Benchtop disinfectant/discard can
Needles, glass slides, syringes,        Sharps container
  pipettes, other types of sharps
Noncontaminated paper                   Regular trash
Petri dishes and contaminated solids    Biohazard “orange/red bag” container
  (other than pipettes or swabs)
Transfer pipettes (contaminated)        Benchtop disinfectant/discard can



Appendix 2: Addresses of Instrument and Chemical Suppliers



Instrument Manufacturers



Bio-Rad (Asia Pacific) Pvt. Ltd, USA 

Bio-Rad Laboratories (Singapore) Pvt. Ltd 
27 International Business Park 
#01-02 iQuest@IBP 
Singapore 609924, Singapore 



PerkinElmer 

940 Winter Street 
Waltham, MA 02451, USA 

Olympus 

31 Gilby Road, Mount Waverley
VIC 3149, Australia
Tel: +61-3-9265-5400
Fax: +61-3-9562-6438
E-mail: Info@olympus.com.au 



Nikon Corporation Ltd, Japan 

5-21, Katsushima 1-chome 
Shinagawa-ku, Tokyo 140-0012, Japan 
Tel: +81-3-5762-8911



Labmate (Asia Pacific) Pvt. Ltd, UK 

Labmate (Asia) Pvt. Ltd 
Baid Mehta Complex, C-Block
183 Mount Road
Chennai 600015, India 

UVP Ltd, UK 

Ultra-Violet Products Ltd 
Unit 1, Trinity Hall Farm Estate
Nuffield Road, Cambridge CB4 1TG, UK



Millipore SAS, France 

Millipore S.A.S - Molsheim 
BP 116 

67124 Molsheim Cedex, France 



Tomy Digital Biology Co., Ltd, Tokyo 

3-14-17 Tagara, Nerima-ku 

Tokyo 179-0073, Japan 

E-mail: info@digital-biology.co.jp



Eppendorf AG., Germany 

Barkhausenweg 1 
22339 Hamburg, Germany 



Shimadzu Asia Pacific Pvt. Ltd, Singapore 

79 Science Park Drive 
#02-01/08 Cintech IV
Singapore Science Park 1 
Singapore 118264, Singapore 



Sonics & Materials Inc., USA 

Sonics & Materials Inc.

53 Church Hill Road 
Newtown, CT 06470-1614, USA 
203.270.4600 - 800.745.1105 - 203.270.4610 



Applied Biosystem Internationals, USA 

Lingley House
120 Birchwood Boulevard
Warrington, Cheshire WA3 7QH, UK



Sartorius AG, Germany 

Sartorius AG
Weender Landstr. 94-108
37075 Goettingen, Germany



New Brunswick Scientific Co Inc., USA 

Kerkenbos 1101
6546 BC Nijmegen, Netherlands
Tel: +31-24-3717-600 

Castel MAC SpA, Italy 

Via del Lavoro, 9
31033 Castelfranco Veneto, Province of Treviso, Italy
Tel: +39-0423738451 



Astec Co. Ltd, Japan 

Horiguchi Bldg., 2-19-7 
Iwamoto-cho, Chiyoda-ku 
Tokyo 101-0032, Japan 



Molecular Device Corporation, Singapore 

Molecular Devices, LLC 

1311 Orleans Drive 

Sunnyvale, CA 94089-1136, USA 



Thermo Fisher Scientific (Asheville), LLC 

308 Ridgefield Court
Asheville, NC 28806, USA
Tel: +1-828-658-2711 



Scigenics Biotech 

35 (18), Vasudevapuram 
Triplicane, Chennai 600 005, India



Chemical Suppliers

Fermentas Inc.

798 Cromwell Park Drive 

Suites R-S 

Glen Burnie, MD, USA 

Tel: +1-800-3409026 

Fax: +1-800-4728322 

E-mail: fermentas.info@thermofisher.com

Sigma-Aldrich Company Ltd 

The Old Brickyard 
New Road 
Gillingham 

Dorset, SP8 4XT, UK 

Roche Diagnostics Corp. 

9115 Hague Road 
Indianapolis, IN 46250, USA 
Tel: +1-317-521-2000 

Promega Corporation 

2800 Woods Hollow Road 
Madison, WI 53711, USA
Tel: +1-608-274-4330 
Fax: +1-608-277-2516 

Qiagen Inc. 

27220 Turnberry Lane 
Valencia, CA 91355, USA 
Tel: +1-800-426-8157 
Fax: +1-800-718-2056 
Technical: 800-DNA-PREP 
Tel: +1-800-362-7737 

HiMedia Laboratories 

A-406, Bhaveshwar Plaza, LBS Marg 
Mumbai, Maharashtra 400086, India 
Tel: +91-22-25003747/0970

BD

1 Becton Drive
Franklin Lakes, NJ 07417, USA
Tel: +1-201-847-6800 



Difco International B.V. 

Europaplein 30-S 
8916 HH Leeuwarden 
The Netherlands 

Merck & Co., Inc. Global Headquarters 

One Merck Drive 
P.O. Box 100 

Whitehouse Station, NJ 08889-0100, USA
Tel: +1-908-423-1000 



Oxoid Limited 

Wade Road 
Basingstoke 

Hampshire, RG24 8PW, UK